The Hubble relation between distance and redshift is a purely cosmographic relation that depends
only on the symmetries of a FLRW spacetime, and does not intrinsically make any dynamical
assumptions. This suggests that it should be possible to estimate the parameters defining the Hubble
relation without making any dynamical assumptions. To test this idea, we perform a number of
inter-related cosmographic fits to the legacy05 and gold06 supernova datasets. Based on this supernova
data, the "preponderance of evidence" certainly suggests an accelerating universe. However, we would
argue that (unless one uses additional dynamical and observational information) this conclusion
is not currently supported "beyond reasonable doubt". As part of the analysis we develop two
particularly transparent graphical representations of the redshift-distance relation: representations
in which acceleration versus deceleration reduces to the question of whether the relevant graph slopes
up or down.

Turning to the details of the cosmographic fits, three issues in particular concern us. First,
the fitted value for the deceleration parameter changes significantly depending on whether one
performs a chi-squared fit to the luminosity distance, proper motion distance, angular diameter distance, or
other suitable distance surrogate. Second, the fitted value for the deceleration parameter changes
significantly depending on whether one uses the traditional redshift variable z, or what we shall
argue is on theoretical grounds an improved parameterization y = z/(1 + z). Third, the published
estimates for systematic uncertainties are sufficiently large that they certainly impact on, and to
a large extent undermine, the usual purely statistical tests of significance. We conclude that the
supernova data should be treated with some caution.
I. INTRODUCTION

From various observations of the Hubble relation, most recently including the supernova data, one
is by now very accustomed to seeing many plots of luminosity distance $d_L$ versus redshift z. But are there better
ways of representing the data? Consider cosmography (cosmokinetics), which is the part of cosmology that proceeds
by making minimal dynamic assumptions. One keeps the geometry and symmetries of FLRW spacetime,



$$ ds^2 = -c^2\, dt^2 + a(t)^2 \left\{ \frac{dr^2}{1 - k r^2} + r^2 \left( d\theta^2 + \sin^2\theta \; d\phi^2 \right) \right\}, \tag{1} $$

at least as a working hypothesis, but does not assume the Friedmann equations (Einstein equations), unless and until 
absolutely necessary. By doing so it is possible to defer questions about the equation of state of the cosmological 
fluid, minimize the number of theoretical assumptions one is bringing to the table, and so concentrate more directly 
on the observational situation. 

In particular, the "big picture" is best brought into focus by performing a global fit of all available supernova data
to the Hubble relation, from the current epoch at least back to redshift z ≈ 1.75. Indeed, all the discussion over
acceleration versus deceleration, and the presence (or absence) of jerk (and snap) ultimately boils down, in a
cosmographic setting, to doing a finite-polynomial truncated-Taylor series fit of the distance measurements (determined by
supernovae and other means) to some suitable form of distance-redshift or distance-velocity relationship. Phrasing
the question to be investigated in this way keeps it as close as possible to Hubble's original statement of the problem,
while minimizing the number of extraneous theoretical assumptions one is forced to adopt. For instance, it is quite
standard to phrase the investigation in terms of the luminosity distance versus redshift relation:



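$$ d_L(z) = \frac{c\, z}{H_0} \left\{ 1 + \frac{1}{2}\left[1 - q_0\right] z + O(z^2) \right\}, \tag{2} $$

and its higher-order extension

$$ d_L(z) = \frac{c\, z}{H_0} \left\{ 1 + \frac{1}{2}\left[1 - q_0\right] z - \frac{1}{6}\left[1 - q_0 - 3 q_0^2 + j_0 + \frac{k c^2}{H_0^2 a_0^2}\right] z^2 + O(z^3) \right\}. \tag{3} $$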
Following the spirit of Hubble's original proposal [25], one could in principle directly fit such a relation to the supernova
data, thereby estimating cosmological parameters (such as $H_0$, $q_0$, and the jerk $j_0$) without making any
dynamical assumptions. There are, however, ways of pre-processing the Hubble relation that make the result (and potential
problems) stand out in greater clarity.

A central question thus has to do with the choice of the luminosity distance as the primary quantity of interest:
there are several other notions of cosmological distance that can be used, some of which (we shall see) lead to
simpler and more tractable versions of the Hubble relation. Furthermore, as will quickly be verified by looking at the
derivation, the standard Hubble law is actually a Taylor series expansion derived
for small z, whereas much of the most interesting recent supernova data occurs at z > 1. Should we even trust the
usual formalism for large z > 1? Two distinct things could go wrong: (1) The underlying Taylor series could fail to
converge. (2) Finite truncations of the Taylor series might be a bad approximation to the exact result.

In fact, both things happen [13, 14], and there are good mathematical and physical reasons for this undesirable
behaviour. Moreover, once one stops to consider it carefully, why should the cosmology community be so focussed
on using the luminosity distance (or its logarithm, proportional to the distance modulus) and the redshift z as
the relevant parameters? In principle, in place of luminosity distance $d_L(z)$ versus redshift z one could just as easily
plot $f(d_L, z)$ versus $g(z)$, choosing $f(d_L, z)$ and $g(z)$ to be arbitrary locally invertible functions, and exactly the same
physics would be encoded. Suitably choosing the quantities to be plotted and fit will not change the physics, but it
might improve statistical properties and insight. In particular, as argued in [13, 14], choosing the y-redshift [defined
by y = z/(1 + z)] will definitely improve the behaviour of the Taylor series.

By comparing cosmological parameters obtained using multiple different fits of the Hubble relation to different
distance scales, and different parameterizations of the redshift, we can then assess the robustness and reliability
of the data fitting procedure. In performing this analysis we had initially hoped to verify the robustness of the
Hubble relation, and to possibly obtain improved estimates of cosmological parameters such as the deceleration
parameter and jerk parameter, thereby complementing other recent cosmographic and cosmokinetic analyses,
as well as other analyses that take a sometimes skeptical view of the totality of the observational
data [23, 24]. The actual results of our current cosmographic fits to the data are considerably more
ambiguous than we had initially expected, and there are many subtle issues hiding in the simple phrase "fitting the
data".

In the following sections we first introduce the various cosmological distance scales, and the related versions of
the Hubble relation. Due to technical problems with the z-redshift for z > 1 [13, 14] we introduce the improved
y-redshift y = z/(1 + z), leading to yet more versions of the Hubble relation. After discussing key features of the
supernova data, we perform, analyze, and contrast multiple fits to the Hubble relation, providing discussions of
model-building uncertainties (some technical details being relegated to the appendices) and sensitivity to systematic
uncertainties. Finally we present our results and conclusions: There is a disturbingly strong model-dependence in the
resulting estimates for the deceleration parameter. Furthermore, once realistic estimates of systematic uncertainties
(based on the published data) are budgeted for, it becomes clear that purely statistical estimates of goodness of
fit are dangerously misleading. While the "preponderance of evidence" certainly suggests an accelerating universe,
we would argue that this conclusion is not currently supported "beyond reasonable doubt": the supernova data
(considered by itself) certainly rather strongly suggests an accelerating universe, but is not sufficient (by itself) to
allow us to reliably conclude that the universe is accelerating. (If one adds additional theoretical assumptions, such
as by specifically fitting to a Λ-CDM model, the situation at first glance looks somewhat better, but this is then
telling you as much about one's choice of theoretical model as it is about the observational situation.)

The need for a certain amount of caution in interpreting the observational data can clearly be inferred from a
dispassionate reading of history. If one compares Hubble's original 1929 version of what is now called the Hubble
plot [25] with modern updates (see, for instance, Kirshner's review of 2004 [26]), it is clear that the estimated value
of the Hubble parameter has undergone significant revision over the past 80 years, by almost a factor of 10. Indeed,
Kirshner provides a very telling plot of the estimated value of the Hubble parameter as a function of publication
date [26]. Regarding this last plot, Kirshner is moved to comment [26]:

"At each epoch, the estimated error in the Hubble constant is small compared with the subsequent changes
in its value. This result is a symptom of underestimated systematic errors."

It is important to realise that the systematic under-estimating of systematic uncertainties is a generic phenomenon
that cuts across disciplines and sub-fields; it is not a phenomenon that is limited to cosmology. For instance, the
"Particle Data Group" [http://pdg.lbl.gov/] in their bi-annual "Review of Particle Properties" publishes fascinating
plots of estimated values of various particle physics parameters as a function of publication date [27]. These plots
illustrate an aspect of the experimental and observational sciences that is often overlooked:

It is simply part of human nature to always think the situation regarding systematic uncertainties is better
than it actually is: systematic uncertainties are systematically under-reported.

Apart from the many technical points we discuss in [13, 14], and touch on in the body of the article below (ranging
from the appropriate choice of cosmological distance scale, to the most appropriate version of redshift, to the "best"
way of representing the Hubble law), this historical perspective should also be kept in focus: ultimately the treatment
of systematic uncertainties will prove to be an important component in estimating the reliability and robustness of
the conclusions one can draw from the data.



II. COSMOLOGICAL DISTANCE SCALES 



In cosmology there are numerous different and equally natural definitions of the notion of "distance" between two
objects or events, whether directly observable or not. For the vertical axis of the Hubble plot, instead of using the
standard default choice of luminosity distance $d_L$, let us now consider using one or more of the distance scales in
Table I.



TABLE I: Cosmological distance scales.

distance   name                         relation
d_L        luminosity distance
d_F        photon flux distance         d_F = d_L (1+z)^{-1/2}
d_P        photon count distance        d_P = d_L (1+z)^{-1}
d_Q        deceleration distance        d_Q = d_L (1+z)^{-3/2}
d_A        angular diameter distance    d_A = d_L (1+z)^{-2}



All of these cosmological distance scales are ultimately related to the "distance modulus":

$$ \mu_D = 5 \log_{10}[d_L/(10 \text{ pc})] = 5 \log_{10}[d_L/(1 \text{ Mpc})] + 25, \tag{4} $$

and the reasons for the terminology and usefulness of these quantities are more fully explained in [13, 14]. (See also
Hogg [28] for a discussion of the various cosmological distance scales in common use.)
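
As a concrete illustration of Table I and the distance modulus relation (4), the following Python sketch converts a distance modulus and a redshift into the five distance scales. The function names and the sample values (a modulus of 40 at z = 0.5) are hypothetical, chosen only for illustration; they are not taken from either supernova dataset.

```python
# Exponent p in d_X = d_L * (1 + z)**(-p), following Table I.
SCALE_EXPONENTS = {"L": 0.0, "F": 0.5, "P": 1.0, "Q": 1.5, "A": 2.0}


def luminosity_distance_mpc(mu_d):
    """Invert mu_D = 5 log10[d_L / (1 Mpc)] + 25 to get d_L in Mpc."""
    return 10.0 ** ((mu_d - 25.0) / 5.0)


def distance_scales_mpc(mu_d, z):
    """Return all five distance scales (in Mpc) for one supernova."""
    d_l = luminosity_distance_mpc(mu_d)
    return {name: d_l * (1.0 + z) ** (-p) for name, p in SCALE_EXPONENTS.items()}


# Hypothetical sample point: mu_D = 40 corresponds to d_L = 1000 Mpc.
scales = distance_scales_mpc(mu_d=40.0, z=0.5)
for name in "LFPQA":
    print(f"d_{name} = {scales[name]:.1f} Mpc")
```

Keeping the exponents in a single lookup table means that any other distance surrogate of the form d_L (1+z)^{-p} could be added with one extra entry.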

Notation is not always standardized: Indeed, D'Inverno [29] uses what is effectively the photon count distance $d_P$
as his nonstandard definition for luminosity distance. Furthermore, though motivated very differently, the quantity
$d_P$ is equal to Weinberg's definition of proper motion distance, and is also equal to Peebles' version of angular
diameter distance. That is:

$$ d_P = d_{L,\,\mathrm{D'Inverno}} = d_{\mathrm{proper},\,\mathrm{Weinberg}} = d_{A,\,\mathrm{Peebles}}. \tag{5} $$

Furthermore

$$ d_{A,\,\mathrm{Peebles}} = (1+z)\, d_A. \tag{6} $$

Also note that the distance modulus can be rewritten in terms of traditional stellar magnitudes as

$$ \mu_D = \mu_\mathrm{apparent} - \mu_\mathrm{absolute}. \tag{7} $$

The continued use of stellar magnitudes and the distance modulus in the context of cosmology is largely a matter of
historical tradition, though we shall soon see that the logarithmic nature of the distance modulus has interesting and
useful side effects. Note that we prefer as much as possible to deal with natural logarithms: $\ln x = \ln(10) \log_{10} x$.
Indeed

$$ \mu_D = \frac{5}{\ln 10} \ln[d_L/(1 \text{ Mpc})] + 25, \tag{8} $$

so that

$$ \ln[d_L/(1 \text{ Mpc})] = \frac{\ln 10}{5} \left[ \mu_D - 25 \right]. \tag{9} $$

From the definitions above

$$ d_L \geq d_F \geq d_P \geq d_Q \geq d_A. \tag{10} $$

Furthermore these particular distance scales satisfy the property that they converge on each other, and converge on
the naive Euclidean notion of distance, as $z \to 0$.

To simplify subsequent formulae, it is now useful to define the "Hubble distance"

$$ d_H = \frac{c}{H_0}. \tag{11} $$

(The "Hubble distance" $d_H = c/H_0$ is sometimes called the "Hubble radius", or the "Hubble sphere", or even the
"speed of light sphere" [SLS] [30]. Sometimes "Hubble distance" is used to refer to the naive estimate $d = d_H z$
coming from the linear part of the Hubble relation and ignoring all higher-order terms; this is definitely not our
intended meaning.) For the central estimate $H_0 = 73$ (km/sec)/Mpc [27] we have

$$ d_H \approx 4100 \text{ Mpc}. \tag{12} $$

Furthermore we choose to set

$$ \Omega_0 = 1 + \frac{k c^2}{H_0^2 a_0^2} = 1 + \frac{k\, d_H^2}{a_0^2}. \tag{13} $$

Note that $\Omega_0$ is a purely cosmographic definition without dynamical content. (Only if one additionally invokes the
Einstein equations, in the form of the Friedmann equations, does $\Omega_0$ have the standard interpretation as the ratio
of total density to the Hubble density, but we would be prejudging things by making such an identification in the
current cosmographic framework.) In the cosmographic framework $k/a_0^2$ is simply the present day curvature of space
(not spacetime), while $d_H^{-2} = H_0^2/c^2$ is a measure of the contribution of expansion to the spacetime curvature of the
FLRW geometry.
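
The Hubble distance and the cosmographic $\Omega_0$ defined above are simple enough to evaluate directly. The sketch below is only an illustration: the function names are hypothetical, and the scale factor passed to `omega_0` is a made-up number used to show the dependence on the sign of k, not a measured quantity.

```python
C_KM_PER_S = 299792.458  # speed of light in km/s


def hubble_distance_mpc(h0):
    """Hubble distance d_H = c / H_0 in Mpc, for H_0 in (km/sec)/Mpc."""
    return C_KM_PER_S / h0


def omega_0(k, a0_mpc, d_h_mpc):
    """Cosmographic Omega_0 = 1 + k d_H^2 / a_0^2, with k in {-1, 0, +1}."""
    return 1.0 + k * (d_h_mpc / a0_mpc) ** 2


d_h = hubble_distance_mpc(73.0)
print(f"d_H = {d_h:.0f} Mpc")

# a0_mpc below is an invented scale factor, used only to show the role of k.
for k in (-1, 0, 1):
    print(k, omega_0(k, a0_mpc=10.0 * d_h, d_h_mpc=d_h))
```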

III. VERSIONS OF THE HUBBLE LAW 

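Versions of the Hubble law are easily calculated for each of these cosmological distance scales. Explicitly:

$$ d_L(z) = d_H\, z \left\{ 1 - \frac{1}{2}\left[-1 + q_0\right] z + \frac{1}{6}\left[q_0 + 3 q_0^2 - (j_0 + \Omega_0)\right] z^2 + O(z^3) \right\}. \tag{14} $$

$$ d_F(z) = d_H\, z \left\{ 1 - \frac{1}{2} q_0\, z + \frac{1}{24}\left[3 + 10 q_0 + 12 q_0^2 - 4(j_0 + \Omega_0)\right] z^2 + O(z^3) \right\}. \tag{15} $$

$$ d_P(z) = d_H\, z \left\{ 1 - \frac{1}{2}\left[1 + q_0\right] z + \frac{1}{6}\left[3 + 4 q_0 + 3 q_0^2 - (j_0 + \Omega_0)\right] z^2 + O(z^3) \right\}. \tag{16} $$

$$ d_Q(z) = d_H\, z \left\{ 1 - \frac{1}{2}\left[2 + q_0\right] z + \frac{1}{24}\left[27 + 22 q_0 + 12 q_0^2 - 4(j_0 + \Omega_0)\right] z^2 + O(z^3) \right\}. \tag{17} $$

$$ d_A(z) = d_H\, z \left\{ 1 - \frac{1}{2}\left[3 + q_0\right] z + \frac{1}{6}\left[12 + 7 q_0 + 3 q_0^2 - (j_0 + \Omega_0)\right] z^2 + O(z^3) \right\}. \tag{18} $$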

If one simply wants to deduce (for instance) the sign of $q_0$, then plotting the "photon flux distance" $d_F$ versus z would
be a particularly good test: simply check if the first nonlinear term in the Hubble relation curves up or down.

In contrast, the Hubble law for the distance modulus is given by the more complicated expression [13]:

$$ \mu_D(z) = 25 + \frac{5}{\ln 10} \left\{ \ln(d_H/\text{Mpc}) + \ln z + \frac{1}{2}\left[1 - q_0\right] z - \frac{1}{24}\left[3 - 10 q_0 - 9 q_0^2 + 4(j_0 + \Omega_0)\right] z^2 + O(z^3) \right\}. \tag{19} $$

However, when plotting $\mu_D$ versus z, most of the observed curvature in the plot comes from the universal, and
therefore uninteresting, ln z term. It is much better to rearrange the above as [13]:

$$ \ln[d_L/(z \text{ Mpc})] = \frac{\ln 10}{5}\left[\mu_D - 25\right] - \ln z
 = \ln(d_H/\text{Mpc}) - \frac{1}{2}\left[-1 + q_0\right] z + \frac{1}{24}\left[-3 + 10 q_0 + 9 q_0^2 - 4(j_0 + \Omega_0)\right] z^2 + O(z^3). \tag{20} $$

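In a similar manner one has

$$ \ln[d_F/(z \text{ Mpc})] = \frac{\ln 10}{5}\left[\mu_D - 25\right] - \ln z - \frac{1}{2}\ln(1+z)
 = \ln(d_H/\text{Mpc}) - \frac{1}{2} q_0\, z + \frac{1}{24}\left[3 + 10 q_0 + 9 q_0^2 - 4(j_0 + \Omega_0)\right] z^2 + O(z^3). \tag{21} $$

$$ \ln[d_P/(z \text{ Mpc})] = \frac{\ln 10}{5}\left[\mu_D - 25\right] - \ln z - \ln(1+z)
 = \ln(d_H/\text{Mpc}) - \frac{1}{2}\left[1 + q_0\right] z + \frac{1}{24}\left[9 + 10 q_0 + 9 q_0^2 - 4(j_0 + \Omega_0)\right] z^2 + O(z^3). \tag{22} $$

$$ \ln[d_Q/(z \text{ Mpc})] = \frac{\ln 10}{5}\left[\mu_D - 25\right] - \ln z - \frac{3}{2}\ln(1+z)
 = \ln(d_H/\text{Mpc}) - \frac{1}{2}\left[2 + q_0\right] z + \frac{1}{24}\left[15 + 10 q_0 + 9 q_0^2 - 4(j_0 + \Omega_0)\right] z^2 + O(z^3). \tag{23} $$

$$ \ln[d_A/(z \text{ Mpc})] = \frac{\ln 10}{5}\left[\mu_D - 25\right] - \ln z - 2\ln(1+z)
 = \ln(d_H/\text{Mpc}) - \frac{1}{2}\left[3 + q_0\right] z + \frac{1}{24}\left[21 + 10 q_0 + 9 q_0^2 - 4(j_0 + \Omega_0)\right] z^2 + O(z^3). \tag{24} $$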

These logarithmic versions of the Hubble law have several advantages: fits to these relations are easily calculated in
terms of the observationally reported distance moduli $\mu_D$ and their estimated statistical uncertainties.

What message should we take from this discussion? There are many physically equivalent versions of the Hubble
law, corresponding to many slightly different physically reasonable definitions of distance, and to whether we choose to
present the Hubble law linearly or logarithmically. If one were to have arbitrarily small scatter/error bars on the
observational data, then the choice of which Hubble law one chooses to fit to would not matter. In the presence of
significant scatter/uncertainty there is a risk that the quality of the fit might depend strongly on the choice of Hubble
law one chooses to work with. (And if the resulting values of the deceleration parameter one obtains do depend
significantly on which distance scale one uses, this is evidence that one should be very cautious in interpreting the
results.) Note that the two versions of the Hubble law based on "photon flux distance" $d_F$ stand out in terms of
making the deceleration parameter easy to visualize and extract.
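
The statement that these Hubble laws are physically equivalent can be checked mechanically: multiplying the bracketed series for $d_L$ in (14) by the Taylor expansion of $(1+z)^{-p}$ should reproduce the brackets in (15)-(18). The sketch below does this with exact rational arithmetic from the Python standard library. The helper names and the numerical test values for $q_0$, $j_0$ and $\Omega_0$ are arbitrary choices made for this illustration, not fitted parameters.

```python
from fractions import Fraction as Fr


def binomial_series(power, order=2):
    """Exact Taylor coefficients of (1 + z)**power up to z**order."""
    coeffs, term = [Fr(1)], Fr(1)
    for n in range(1, order + 1):
        term = term * (power - (n - 1)) / n
        coeffs.append(term)
    return coeffs


def multiply(a, b, order=2):
    """Product of two truncated power series, dropping terms beyond z**order."""
    out = [Fr(0)] * (order + 1)
    for i, a_i in enumerate(a):
        for j, b_j in enumerate(b):
            if i + j <= order:
                out[i + j] += a_i * b_j
    return out


def quoted_brackets(q0, j0, omega0):
    """Bracketed series d_X / (d_H z) as quoted in eqs. (14)-(18)."""
    s = j0 + omega0
    return {
        "L": [Fr(1), -Fr(1, 2) * (-1 + q0), Fr(1, 6) * (q0 + 3 * q0**2 - s)],
        "F": [Fr(1), -Fr(1, 2) * q0, Fr(1, 24) * (3 + 10 * q0 + 12 * q0**2 - 4 * s)],
        "P": [Fr(1), -Fr(1, 2) * (1 + q0), Fr(1, 6) * (3 + 4 * q0 + 3 * q0**2 - s)],
        "Q": [Fr(1), -Fr(1, 2) * (2 + q0), Fr(1, 24) * (27 + 22 * q0 + 12 * q0**2 - 4 * s)],
        "A": [Fr(1), -Fr(1, 2) * (3 + q0), Fr(1, 6) * (12 + 7 * q0 + 3 * q0**2 - s)],
    }


def derived_brackets(q0, j0, omega0):
    """Derive every bracket from the d_L bracket via d_X = d_L (1 + z)**(-p)."""
    d_l = quoted_brackets(q0, j0, omega0)["L"]
    powers = {"L": Fr(0), "F": Fr(-1, 2), "P": Fr(-1), "Q": Fr(-3, 2), "A": Fr(-2)}
    return {name: multiply(d_l, binomial_series(p)) for name, p in powers.items()}


# Arbitrary rational test values (not fitted parameters).
params = (Fr(-1, 2), Fr(1), Fr(1))
print(derived_brackets(*params) == quoted_brackets(*params))
```

Using `Fraction` rather than floating point makes the comparison an exact identity check on the coefficients, so a mistyped coefficient would show up as a mismatch rather than a small numerical difference.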
IV. MORE VERSIONS OF THE HUBBLE LAW

In [13, 14] we argued for the usefulness (better convergence properties for the Taylor series over the redshift range
under consideration) of adopting the y-redshift variable:

$$ y = \frac{\lambda_0 - \lambda_e}{\lambda_0} = \frac{\Delta\lambda}{\lambda_0}. \tag{25} $$

That is, define y to be the change in wavelength divided by the observed wavelength. Then

$$ y = \frac{z}{1+z}; \qquad z = \frac{y}{1-y}. \tag{26} $$

In the past (of an expanding universe)

$$ z \in (0, \infty); \qquad y \in (0, 1); \tag{27} $$

while in the future

$$ z \in (-1, 0); \qquad y \in (-\infty, 0). \tag{28} $$
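
The change of variable in (26) and the ranges in (27) and (28) can be made concrete with a few lines of Python. The function names are our own, and the sample redshifts are illustrative; z = 1.755 is included because it is the most distant supernova in the gold06 dataset discussed later.

```python
def z_to_y(z):
    """y = z / (1 + z), defined for z > -1."""
    if z <= -1.0:
        raise ValueError("z must exceed -1")
    return z / (1.0 + z)


def y_to_z(y):
    """z = y / (1 - y), defined for y < 1."""
    if y >= 1.0:
        raise ValueError("y must be below 1")
    return y / (1.0 - y)


for z in (0.1, 0.5, 1.0, 1.755, 10.0):
    print(f"z = {z:6.3f}  ->  y = {z_to_y(z):.4f}")
```

The entire past, z in (0, ∞), is compressed into y in (0, 1), which is why high-redshift points sit much closer to the bulk of the data when plotted against y.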
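In terms of this new redshift variable, the "linear in distance" Hubble relations are:

$$ d_L(y) = d_H\, y \left\{ 1 - \frac{1}{2}\left[-3 + q_0\right] y + \frac{1}{6}\left[12 - 5 q_0 + 3 q_0^2 - (j_0 + \Omega_0)\right] y^2 + O(y^3) \right\}. \tag{29} $$

$$ d_F(y) = d_H\, y \left\{ 1 - \frac{1}{2}\left[-2 + q_0\right] y + \frac{1}{24}\left[27 - 14 q_0 + 12 q_0^2 - 4(j_0 + \Omega_0)\right] y^2 + O(y^3) \right\}. \tag{30} $$

$$ d_P(y) = d_H\, y \left\{ 1 - \frac{1}{2}\left[-1 + q_0\right] y + \frac{1}{6}\left[3 - 2 q_0 + 3 q_0^2 - (j_0 + \Omega_0)\right] y^2 + O(y^3) \right\}. \tag{31} $$

$$ d_Q(y) = d_H\, y \left\{ 1 - \frac{q_0}{2}\, y + \frac{1}{24}\left[3 - 2 q_0 + 12 q_0^2 - 4(j_0 + \Omega_0)\right] y^2 + O(y^3) \right\}. \tag{32} $$

$$ d_A(y) = d_H\, y \left\{ 1 - \frac{1}{2}\left[1 + q_0\right] y + \frac{1}{6}\left[q_0 + 3 q_0^2 - (j_0 + \Omega_0)\right] y^2 + O(y^3) \right\}. \tag{33} $$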

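Note that in terms of the y variable it is the "deceleration distance" $d_Q$ that has the deceleration parameter $q_0$
appearing in the simplest manner. Similarly, the "logarithmic in distance" Hubble relations are [13, 14]:

$$ \ln[d_L/(y \text{ Mpc})] = \frac{\ln 10}{5}\left[\mu_D - 25\right] - \ln y
 = \ln(d_H/\text{Mpc}) - \frac{1}{2}\left[-3 + q_0\right] y + \frac{1}{24}\left[21 - 2 q_0 + 9 q_0^2 - 4(j_0 + \Omega_0)\right] y^2 + O(y^3). \tag{34} $$

$$ \ln[d_F/(y \text{ Mpc})] = \frac{\ln 10}{5}\left[\mu_D - 25\right] - \ln y + \frac{1}{2}\ln(1-y)
 = \ln(d_H/\text{Mpc}) - \frac{1}{2}\left[-2 + q_0\right] y + \frac{1}{24}\left[15 - 2 q_0 + 9 q_0^2 - 4(j_0 + \Omega_0)\right] y^2 + O(y^3). \tag{35} $$

$$ \ln[d_P/(y \text{ Mpc})] = \frac{\ln 10}{5}\left[\mu_D - 25\right] - \ln y + \ln(1-y)
 = \ln(d_H/\text{Mpc}) - \frac{1}{2}\left[-1 + q_0\right] y + \frac{1}{24}\left[9 - 2 q_0 + 9 q_0^2 - 4(j_0 + \Omega_0)\right] y^2 + O(y^3). \tag{36} $$

$$ \ln[d_Q/(y \text{ Mpc})] = \frac{\ln 10}{5}\left[\mu_D - 25\right] - \ln y + \frac{3}{2}\ln(1-y)
 = \ln(d_H/\text{Mpc}) - \frac{1}{2} q_0\, y + \frac{1}{24}\left[3 - 2 q_0 + 9 q_0^2 - 4(j_0 + \Omega_0)\right] y^2 + O(y^3). \tag{37} $$

$$ \ln[d_A/(y \text{ Mpc})] = \frac{\ln 10}{5}\left[\mu_D - 25\right] - \ln y + 2\ln(1-y)
 = \ln(d_H/\text{Mpc}) - \frac{1}{2}\left[1 + q_0\right] y + \frac{1}{24}\left[-3 - 2 q_0 + 9 q_0^2 - 4(j_0 + \Omega_0)\right] y^2 + O(y^3). \tag{38} $$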

Again note that the "logarithmic in distance" versions of the Hubble law are attractive in terms of maximizing the
disentangling between Hubble distance (treated as a "nuisance parameter"), deceleration parameter, and jerk. Now
having a selection of Hubble laws on hand, we can start to confront the observational data to see what it is capable
of telling us.

V. SUPERNOVA DATA 

For the plots below we have used data from the supernova legacy survey (legacy05) [1, 2] and the Riess et al. "gold"
dataset of 2006 (gold06).

A. The legacy05 dataset

The data is available in published form [1], and in a slightly different format, via internet [2]. (The differences
amount to minor matters of choice in the presentation.) The final processed result reported for each of the 115
supernovae is a redshift z, a luminosity modulus $\mu_B$, and an uncertainty in the luminosity modulus. The luminosity
modulus can be converted into a luminosity distance via the formula

$$ d_L = (1 \text{ Megaparsec}) \times 10^{(\mu_B + \mu_\mathrm{offset} - 25)/5}. \tag{39} $$

The reason for the "offset" is that supernovae by themselves only determine the shape of the Hubble relation (i.e., $q_0$,
$j_0$, etc.), but not its absolute slope (i.e., $H_0$); this is ultimately due to the fact that we do not have good control of
the absolute luminosity of the supernovae in question. The offset $\mu_\mathrm{offset}$ can be chosen to match the known value of $H_0$
coming from other sources. (In fact the data reported in the published article [1] has already been normalized in this
way to the "standard value" $H_{70} = 70$ (km/sec)/Mpc, corresponding to Hubble distance $d_{70} = c/H_{70} = 4283$ Mpc,
whereas the data available on the website [2] has not been normalized in this way, which is why $\mu_B$ as reported on
the website is systematically 19.308 stellar magnitudes smaller than that in the published article.)

The other item one should be aware of concerns the error bars: The error bars reported in the published article [1]
are photometric uncertainties only; there is an additional source of error to do with the intrinsic variability of the
supernovae. In fact, if you take the photometric error bars seriously as estimates of the total uncertainty, you would
have to reject the hypothesis that we live in a standard FLRW universe. Instead, intrinsic variability in the supernovae
is by far the most widely accepted interpretation. Basically one uses the "nearby" dataset to estimate an intrinsic
variability that makes chi-squared look reasonable. This intrinsic variability of 0.13104 stellar magnitudes has
been estimated by looking at low redshift supernovae (where we have good measures of absolute distance from other
techniques), and has been included in the error bars reported on the website [2]. Indeed

$$ (\text{uncertainty})_\text{website} = \sqrt{(\text{intrinsic variability})^2 + (\text{uncertainty})^2_\text{article}}. \tag{40} $$

With these key features of the supernovae data kept in mind, conversion to luminosity distance and estimation of 
scientifically reasonable error bars (suitable for chi-square analysis) is straightforward. 
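
The conversion in (39) and the quadrature sum in (40) can be sketched as follows. This is a minimal illustration rather than the pipeline actually used for the fits: the function names are hypothetical, the sample record is invented, and applying 19.308 as the offset for website-style values is our reading of the normalization described above.

```python
import math

INTRINSIC_MAG = 0.13104  # intrinsic variability quoted in the text (magnitudes)


def luminosity_distance_mpc(mu_b, mu_offset=0.0):
    """d_L = (1 Mpc) * 10**((mu_B + mu_offset - 25) / 5), as in eq. (39)."""
    return 10.0 ** ((mu_b + mu_offset - 25.0) / 5.0)


def total_uncertainty_mag(photometric_mag, intrinsic_mag=INTRINSIC_MAG):
    """Add photometric and intrinsic uncertainties in quadrature, as in eq. (40)."""
    return math.hypot(photometric_mag, intrinsic_mag)


def ln_distance_uncertainty(sigma_mag):
    """Propagate a magnitude uncertainty to ln(d_L): ln(10) * sigma / 5."""
    return math.log(10.0) * sigma_mag / 5.0


# Hypothetical supernova record (values invented for illustration only).
mu_b, photometric = 23.0, 0.05
sigma = total_uncertainty_mag(photometric)
print(luminosity_distance_mpc(mu_b, mu_offset=19.308))
print(sigma, ln_distance_uncertainty(sigma))
```

Because the fits are carried out on logarithmic distances, the last helper is the one that supplies the error bars: a magnitude uncertainty maps linearly onto an uncertainty in ln d.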

To orient oneself, figure 1 focuses on the deceleration distance $d_Q(y)$, and plots $\ln(d_Q/[y \text{ Mpc}])$ versus y. Visually,
the curve appears close to flat, at least out to y ≈ 0.4, which is an unexpected oddity that merits further investigation,
since this implies an "eyeball estimate" that $q_0 \approx 0$. Note that this is not a plot of "statistical residuals" obtained
after curve fitting; rather this can be interpreted as a plot of "theoretical residuals", obtained by first splitting
off the linear part of the Hubble law (which is now encoded in the intercept with the vertical axis), and secondly
choosing the quantity to be plotted so as to make the slope of the curve at zero particularly easy to interpret in terms
of the deceleration parameter. The fact that there is considerable "scatter" in the plot should not be thought of as
an artifact due to a "bad" choice of variables; instead this choice of variables should be thought of as "good" in
the sense that they provide an honest basis for dispassionately assessing the quality of the data that currently goes
into determining the deceleration parameter. Similarly, figure 2 focuses on the photon flux distance $d_F(z)$, and plots
$\ln(d_F/[z \text{ Mpc}])$ versus z. Visually, this curve is again very close to flat, at least out to z ≈ 0.4. This again gives one
a feel for just how tricky it is to reliably estimate the deceleration parameter $q_0$ from the data.

FIG. 1: The normalized logarithm of the deceleration distance, $\ln(d_Q/[y \text{ Mpc}])$, as a function of the y-redshift using the legacy05
dataset [1, 2].

B. The gold06 dataset 

Our second collection of data is the gold06 dataset. This dataset contains 206 supernovae (including most but
not all of the legacy05 supernovae) and reaches out considerably further in redshift, with one outlier at z = 1.755,
corresponding to y = 0.6370. Though the dataset is considerably more extensive it is unfortunately heterogeneous,
combining observations from five different observing platforms over almost a decade. In some cases full data
on the operating characteristics of the telescopes used does not appear to be publicly available. The issue of data
inhomogeneity has been specifically addressed by Nesseris and Perivolaropoulos in [31]. (For related discussion, see
also [22].) In the gold06 dataset one is presented with distance moduli and total uncertainty estimates, in particular,
including the intrinsic dispersion.

A particular point of interest is that the HST-based high-z supernovae previously published in the gold04 dataset
have their estimated distances reduced by approximately 5% (corresponding to $\Delta\mu_D = 0.10$), due to a better
understanding of nonlinearities in the photodetectors. (Changes in stellar magnitude are related to changes in luminosity
distance via equations (8) and (9). Explicitly $\Delta(\ln d_L) = \ln 10 \; \Delta\mu_D/5$, so that for a given uncertainty in magnitude
the corresponding luminosity distances are multiplied by a factor $10^{\Delta\mu_D/5}$. Then 0.10 magnitudes → 4.7% ≈ 5%, and
similarly 0.19 magnitudes → 9.1%.)
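
The two percentages quoted in this parenthetical remark follow directly from the factor $10^{\Delta\mu_D/5}$. A short Python check, with function names of our own choosing, is:

```python
def distance_factor(delta_mu):
    """Multiplicative change in d_L for a shift delta_mu in distance modulus."""
    return 10.0 ** (delta_mu / 5.0)


def percent_change(delta_mu):
    """The same change expressed as a percentage of the original distance."""
    return 100.0 * (distance_factor(delta_mu) - 1.0)


for delta_mu in (0.10, 0.19):
    print(f"{delta_mu:.2f} mag -> {percent_change(delta_mu):.1f}%")
```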

Furthermore, the authors of the gold06 dataset incorporate (most of) the supernovae in the legacy dataset [1], but do so in
a modified manner by reducing their estimated distance moduli by $\Delta\mu_D = 0.19$ (corresponding naively to a 9.1%
reduction in luminosity distance); however this is only a change in the normalization used in reporting the data,
not a physical change in distance. Based on revised modelling of the light curves, and ignoring the question of overall
normalization, the overlap between the gold06 and legacy05 datasets is argued to be consistent to within 0.5%.

FIG. 2: The normalized logarithm of the photon flux distance, $\ln(d_F/[z \text{ Mpc}])$, as a function of the z-redshift using the legacy05
dataset [1, 2].

The critical point is this: Since one is still seeing ≈ 5% variations in estimated supernova distances on a two-year
timescale, this strongly suggests that the unmodelled systematic uncertainties (the so-called "unknown unknowns")
are not yet fully under control in even the most recent data. It would be prudent to retain a systematic uncertainty
budget of at least 5% (more specifically, $\Delta\mu_D = 0.10$), and not to place too much credence in any result that is not
robust under possible systematic recalibrations of this magnitude. Indeed the authors of the gold06 dataset state:

• "... we adopt a limit on redshift-dependent systematics to be 5% per Δz = 1";

• "At present, none of the known, well-studied sources of systematic error rivals the statistical errors presented
here."

We shall have more to say about possible systematic uncertainties, both "known unknowns" and "unknown unknowns",
later in this article.

To orient oneself, figure 3 again focusses on the normalized logarithm of the deceleration distance $d_Q(y)$ as a
function of y-redshift. Similarly, figure 4 focusses on the normalized logarithm of the photon flux distance $d_F(z)$ as a
function of z-redshift. Visually, these curves are again very close to flat out to y ≈ 0.4 and z ≈ 0.4 respectively, which
implies an "eyeball estimate" that $q_0 \approx 0$. Again, this gives one a feel for just how tricky it is to reliably estimate the
deceleration parameter $q_0$ from the data.

Note the outlier at y = 0.6370, that is, z = 1.755. In particular, observe that adopting the y-redshift in place of
the z-redshift has the effect of pulling this outlier "closer" to the main body of data, thus reducing its "leverage"
effect on any data fitting one undertakes. Apart from the theoretical reasons we have given for preferring the
y-redshift [13, 14] (improved convergence behaviour for the Taylor series), the fact that it automatically reduces the
leverage of high redshift outliers is a feature that is considered highly desirable purely for statistical reasons. In
particular, the method of least-squares is known to be non-robust with respect to outliers. One could implement more
robust regression algorithms, but they are not as easy and fast as the classical least-squares method. We have also
implemented least-squares regression against a reduced dataset where we have trimmed out the most egregious high-z
outlier, and also eliminated the so-called "Hubble bubble" for z < 0.0233 [32, 33]. While the precise numerical values
of our estimates for the cosmological parameters then change, there is no great qualitative change to the points we
wish to make in this article, nor to the conclusions we will draw.
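
The reduction in leverage can be illustrated with the textbook leverage formula for an unweighted straight-line fit. This is a deliberate simplification of the weighted polynomial fits used in this article, and the redshift list below is synthetic (it is not the gold06 sample); only the final value, z = 1.755, is taken from the text.

```python
def leverages(x):
    """Leverage h_i = 1/N + (x_i - mean)^2 / sum_j (x_j - mean)^2 for a straight-line fit."""
    n = len(x)
    mean = sum(x) / n
    spread = sum((xi - mean) ** 2 for xi in x)
    return [1.0 / n + (xi - mean) ** 2 / spread for xi in x]


# Synthetic redshifts (not the gold06 sample), ending with the z = 1.755 outlier.
z_values = [0.05, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.8, 1.0, 1.755]
y_values = [z / (1.0 + z) for z in z_values]

h_z = leverages(z_values)[-1]
h_y = leverages(y_values)[-1]
print(f"outlier leverage in z: {h_z:.2f}, in y: {h_y:.2f}")
```

For this synthetic sample the outlier carries noticeably less leverage when the abscissa is y rather than z, which is the qualitative effect described above; the actual numbers for the real datasets depend on their redshift distributions and weights.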
FIG. 3: The normalized logarithm of the deceleration distance, $\ln(d_Q/[y \text{ Mpc}])$, as a function of the y-redshift using the gold06
dataset.

C. Peculiar velocities 

One point that should be noted for both the legacy05 and gold06 datasets is the way that peculiar velocities have
been treated. While peculiar velocities would physically seem to be best represented by assigning an uncertainty to
the measured redshift, in both these datasets the peculiar velocities have instead been modelled as some particular
function of z-redshift and then lumped into the reported uncertainties in the distance modulus. (This feature of
artificially idealizing the observed redshifts as exact is responsible, later on in our analysis, for the degeneracy of
the statistical uncertainties in the deceleration parameter.) Working with the y-redshift ab initio might lead one to
re-assess the model for the uncertainty due to peculiar velocities. We expect such effects to be small and have not
considered them in detail.



VI. DATA FITTING: STATISTICAL UNCERTAINTIES

We shall now compare and contrast the results of multiple least-squares fits to the different notions of cosmological
distance, using the two distinct redshift parameterizations discussed above. Specifically, we use a finite-polynomial
truncated Taylor series as our model, and perform classical least-squares fits. This is effectively a test of the robustness
of the data-fitting procedure, testing it for model dependence.

In brief, fits were carried out for all five distance surrogates, and for both definitions of redshift, using polynomial
approximants to the Hubble relation up to 7th-order. The F-test was then used to discard statistically meaningless
terms, and it was seen that quadratic fits were the best that could meaningfully be adopted. (Note that in a
cosmographic framework, where one is most closely following the spirit of Hubble's original methodology [25], one
does not have a dynamical model to fit the data to, and the use of least-squares fits to a truncated Taylor series is
the best one can possibly hope for. Ultimately, however, one should note that the truncated Taylor series method is
not really a very radical approach, being firmly based in quite standard statistical techniques [35, 36, 37, 38, 39, 40].)
The results are presented in the tables that follow.
FIG. 4: The normalized logarithm of the photon flux distance, ln(d_F/[z Mpc]), as a function of the z-redshift using the gold06
dataset.

A. Finite-polynomial truncated-Taylor-series fit

Working (for purposes of the presentation) in terms of y-redshift, the various distance scales can be fitted to
finite-length power-series polynomials d(y) of the form

\[
  P(y): \quad d(y) = \sum_{j=0}^{n} a_j \, y^j
  \tag{41}
\]

where the coefficients a_j all have the dimensions of distance. In contrast, logarithmic fits are of the form

\[
  P(y): \quad \ln[d(y)/(y \; \mathrm{Mpc})] = \sum_{j=0}^{n} b_j \, y^j
  \tag{42}
\]

where the coefficients b_j are now all dimensionless. By fitting to finite polynomials we are implicitly making the
assumption that the higher-order coefficients are all exactly zero; this does then implicitly enforce assumptions
regarding the higher-order time derivatives d^m a/dt^m for m > n, but there is no way to avoid making at least some
assumptions of this type [M HE [H, Sz|, H, H, E3| . 
The method of least squares requires that we minimize

\[
  \chi^2 = \sum_{I=1}^{N} \left( \frac{P_I - P(y_I)}{\sigma_I} \right)^2
  \tag{43}
\]

where the N data points (y_I, P_I) represent the relevant function P_I = f(μ_{D,I}, y_I) of the distance modulus μ_{D,I}
at corresponding y-redshift y_I, as inferred from some specific supernovae dataset. Furthermore P(y_I) is the finite
polynomial model evaluated at y_I. The σ_I are the total statistical uncertainties in P_I (including, in particular, intrinsic
dispersion). The location of the minimum value of χ² can be determined by setting the derivatives of χ² with respect
to each of the coefficients a_j or b_j equal to zero.
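
The minimisation just described can be sketched numerically. The following is a minimal illustration rather than the code used for the fits reported here: the function names (`solve_linear`, `fit_polynomial`, `chi_square`) and the synthetic data are assumptions introduced for this example, and the normal equations are solved by plain Gaussian elimination, which is adequate only for low polynomial orders.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(map(float, A[r])) + [float(b[r])] for r in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x


def fit_polynomial(y, P, sigma, order):
    """Coefficients minimising chi^2 = sum(((P_I - P(y_I)) / sigma_I)**2)."""
    m = order + 1
    w = [1.0 / s**2 for s in sigma]
    # Setting d(chi^2)/d(b_j) = 0 gives the normal equations alpha . b = beta
    alpha = [[sum(wi * yi ** (j + k) for wi, yi in zip(w, y)) for k in range(m)]
             for j in range(m)]
    beta = [sum(wi * Pi * yi ** j for wi, Pi, yi in zip(w, P, y)) for j in range(m)]
    return solve_linear(alpha, beta)


def chi_square(y, P, sigma, coeffs):
    """Evaluate chi^2 for a given set of polynomial coefficients."""
    total = 0.0
    for yi, Pi, si in zip(y, P, sigma):
        model = sum(c * yi ** j for j, c in enumerate(coeffs))
        total += ((Pi - model) / si) ** 2
    return total


# Synthetic, noise-free data generated from a known quadratic (not supernova data)
y = [0.05, 0.10, 0.20, 0.30, 0.40, 0.50]
true_b = [8.0, 1.5, -0.7]
P = [sum(c * yi ** j for j, c in enumerate(true_b)) for yi in y]
sigma = [0.2] * len(y)
b_fit = fit_polynomial(y, P, sigma, order=2)
```

With noise-free input the fit should recover `true_b` to rounding error, which makes the sketch easy to check before it is pointed at real data.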

Note that the theoretical justification for using least squares assumes that the statistical uncertainties are normally 
distributed Gaussian uncertainties — and there is no real justification for this assumption in the actual data. Further- 
more if the data is processed by using some nonlinear transformation, then in general Gaussian uncertainties will not 
remain Gaussian, and so even if the untransformed uncertainties are Gaussian the theoretical justification for using
least squares is again undermined unless the scatter/uncertainties are small [in the sense that σ ≪ |f'(x)/f''(x)|], in
which case one can appeal to a local linearization of the nonlinear data transformation f(x) to deduce approximately
Gaussian uncertainties [HI, HH, HE H3, HI, US 53] ■ As we have already seen, in figures [T] HI there is again no real 
justification for this "small scatter" assumption in the actual data; nevertheless, in the absence of any clearly better
data-fitting prescription, least squares is the standard way of proceeding. More statistically sophisticated techniques,
such as "robust regression", have their own distinct drawbacks and, even with weak theoretical underpinning, χ²
data-fitting is still typically the technique of choice [H, HH, [H, H3, [H, [H, HO] . 

We have performed least squares analyses, both linear in distance and logarithmic in distance, for all of the distance
scales discussed above, d_L, d_F, d_P, d_Q, and d_A, both in terms of z-redshift and y-redshift, for finite polynomials from
n = 1 (linear) to n = 7 (septic). We stopped at n = 7 since beyond that point the least squares algorithm was found
to become numerically unstable due to the need to invert a numerically ill-conditioned matrix; this ill-conditioning
is actually a well-known feature of high-order least-squares polynomial fitting. We carried out the analysis to such
high order purely as a diagnostic; we shall soon see that the "most reasonable" fits are actually rather low order
n = 2 quadratic fits.

B. χ² goodness of fit

A convenient measure of the goodness of fit is given by the reduced chi-square:

\[
  \chi^2_\nu = \frac{\chi^2}{\nu},
  \tag{44}
\]

where the factor ν = N − n − 1 is the number of degrees of freedom left after fitting N data points to the n + 1
parameters. If the fitting function is a good approximation to the parent function, then the value of the reduced
chi-square should be approximately unity, χ²_ν ≈ 1. If the fitting function is not appropriate for describing the data,
the value of χ²_ν will be greater than 1. Also, "too good" a chi-square fit (χ²_ν < 1) can come from over-estimating
the statistical measurement uncertainties. Again, the theoretical justification for this test relies on the fact that one
is assuming, without a strong empirical basis, Gaussian uncertainties [H], HH, HH, [13, HH, [sjj H(| ■ In all the cases we 
considered, for polynomials of order n = 2 and above, we found that χ²_ν ≈ 1 for the legacy05 dataset, and χ²_ν ≈ 0.8 < 1
for the gold06 dataset. Linear n = 1 fits often gave high values for χ²_ν. We deduce that:

• It is desirable to keep at least quadratic n = 2 terms in all data fits.

• Caution is required when interpreting the reported statistical uncertainties in the gold06 dataset. 

(In particular, note that some of the estimates of the statistical uncertainties reported in gold06 have themselves
been determined through statistical reasoning, essentially by adjusting χ²_ν to be "reasonable". The effects of such
pre-processing become particularly difficult to untangle when one is dealing with a heterogeneous dataset.)
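
As a small illustration of the bookkeeping in equation (44), the sketch below computes the reduced chi-square for an order-n polynomial fit. The function name `reduced_chi_square` and the numbers fed to it are hypothetical; they are not values taken from either dataset.

```python
def reduced_chi_square(chi2, n_points, order):
    """chi^2 per degree of freedom, with nu = N - n - 1 for an order-n polynomial."""
    nu = n_points - order - 1
    if nu <= 0:
        raise ValueError("need more data points than fitted parameters")
    return chi2 / nu


# Hypothetical inputs, purely illustrative
example = reduced_chi_square(chi2=120.0, n_points=123, order=2)
```

A value well above 1 would point to an inadequate fitting function, while a value well below 1 would point to over-estimated uncertainties, as discussed above.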

C. F-test of additional terms 

How many polynomial terms do we need to include to obtain a good approximation to the parent function? 

The difference between two χ² statistics is also distributed as χ². In particular, if we fit a set of data with a fitting
polynomial of order n − 1 (that is, with n parameters), the resulting value of chi-square associated with the deviations
about the regression, χ²(n − 1), has N − n degrees of freedom. If we add another term to the fitting polynomial, the
corresponding value of chi-square χ²(n) has N − n − 1 degrees of freedom. The difference between these two follows the
χ² distribution with one degree of freedom.

The F_χ statistic follows an F distribution with ν_1 = 1 and ν_2 = N − n − 1,

\[
  F_\chi = \frac{\chi^2(n-1) - \chi^2(n)}{\chi^2(n)/(N-n-1)}.
  \tag{45}
\]

This ratio is a measure of how much the additional term has improved the value of the reduced chi-square. F_χ should
be small when the polynomial of order n does not significantly improve the fit over the polynomial of order n − 1.

In all the cases we considered, the F_χ statistic was not significant when one proceeded beyond n = 2. We deduce
that:
• It is statistically meaningless to go beyond n = 2 terms in the data fits.

• This means that one can at best hope to estimate the deceleration parameter and the jerk (or more precisely
the combination j_0 + Ω_0). There is no meaningful hope of estimating the snap parameter from the current data.
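
The F-test for an additional polynomial term can be sketched as follows. This is an illustration under stated assumptions: the function name `f_chi` and the chi-square values are hypothetical, and no p-value is computed, since that would need an F-distribution routine from outside the standard library.

```python
def f_chi(chi2_lower_order, chi2_order_n, n_points, order_n):
    """F statistic for adding the order-n term to a polynomial of order n - 1."""
    nu2 = n_points - order_n - 1  # degrees of freedom of the order-n fit
    if nu2 <= 0:
        raise ValueError("not enough data points for this polynomial order")
    return (chi2_lower_order - chi2_order_n) / (chi2_order_n / nu2)


# Hypothetical chi-square values for N = 123 points (illustrative only)
f_quadratic = f_chi(150.0, 120.0, n_points=123, order_n=2)  # linear -> quadratic
f_cubic = f_chi(120.0, 119.5, n_points=123, order_n=3)      # quadratic -> cubic
```

In this made-up case the quadratic term gives a large F_χ while the cubic term gives a value below 1. For one numerator degree of freedom and of order a hundred denominator degrees of freedom, the conventional 95% threshold is roughly 4, so only the first of these would count as significant.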



D. Uncertainties in the coefficients a_j and b_j

From the fit one can determine the standard deviations σ_{a_j} and σ_{b_j} for the uncertainty of the polynomial coefficients
a_j or b_j. It is the root sum square of the products of the standard deviation of each data point σ_I, multiplied by the
effect that the data point has on the determination of the coefficient a_j [34]:

\[
  \sigma^2_{a_j} = \sum_I \left[ \sigma_I^2 \left( \frac{\partial a_j}{\partial P_I} \right)^2 \right].
  \tag{46}
\]

Similarly the covariance matrix between the estimates of the coefficients in the polynomial fit is

\[
  \sigma^2_{a_j a_k} = \sum_I \left[ \sigma_I^2 \left( \frac{\partial a_j}{\partial P_I} \right) \left( \frac{\partial a_k}{\partial P_I} \right) \right].
  \tag{47}
\]



Practically, the σ_{a_j} and covariance matrix σ²_{a_j a_k} are determined as follows [34]:

• Determine the so-called curvature matrix α for our specific polynomial model, where

\[
  \alpha_{jk} = \sum_I \left[ \frac{1}{\sigma_I^2} \, (y_I)^j \, (y_I)^k \right].
  \tag{48}
\]

• Invert the symmetric matrix α to obtain the so-called error matrix ε:

\[
  \epsilon = \alpha^{-1}.
  \tag{49}
\]

The uncertainty and covariance in the coefficients a_j are characterized by:

\[
  \sigma^2_{a_j} = \epsilon_{jj}; \qquad \sigma^2_{a_j a_k} = \epsilon_{jk}.
  \tag{50}
\]
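
The recipe above (build the curvature matrix, invert it, read off the variances) can be sketched in a few lines. This is a minimal illustration, not the analysis code: `invert` and `curvature_matrix` are names introduced here, the redshifts and uncertainties are hypothetical, and Gauss-Jordan inversion is used only because the matrices involved are small.

```python
def invert(matrix):
    """Gauss-Jordan inversion with partial pivoting (fine for small matrices)."""
    n = len(matrix)
    aug = [list(map(float, row)) + [1.0 if r == c else 0.0 for c in range(n)]
           for r, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def curvature_matrix(x, sigma, order):
    """alpha_jk = sum_I x_I**(j + k) / sigma_I**2 for a polynomial model."""
    m = order + 1
    return [[sum(xi ** (j + k) / si**2 for xi, si in zip(x, sigma))
             for k in range(m)] for j in range(m)]


x = [0.05, 0.10, 0.20, 0.30, 0.40, 0.50]  # hypothetical y-redshifts
sigma = [0.2] * len(x)                     # hypothetical uncertainties
alpha = curvature_matrix(x, sigma, order=2)
epsilon = invert(alpha)                    # error matrix
coefficient_sigma = [epsilon[j][j] ** 0.5 for j in range(3)]
```

Note that the error matrix depends only on the redshifts and the quoted uncertainties, not on the fitted values themselves.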



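Finally, for any function f(a_i) of the coefficients a_i:

\[
  \sigma_f = \sqrt{ \sum_{j,k} \sigma^2_{a_j a_k} \, \frac{\partial f}{\partial a_j} \, \frac{\partial f}{\partial a_k} }.
  \tag{51}
\]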



Note that these rules for the propagation of uncertainties implicitly assume that the uncertainties are in some suitable
sense "small" so that a local linearization of the functions a_j(P_I) and f(a_i) is adequate.
Now for each individual element of the curvature matrix

\[
  0 < \frac{\alpha_{jk}(z)}{(1+z_{\max})^{2n}} < \frac{\alpha_{jk}(z)}{(1+z_{\max})^{j+k}} < \alpha_{jk}(y) < \alpha_{jk}(z).
  \tag{52}
\]



Furthermore the matrices α_jk(z) and α_jk(y) are both positive definite, and the spectral radius of α(y) is definitely less
than the spectral radius of α(z). After matrix inversion this means that the minimum eigenvalue of the error matrix
ε(y) is definitely greater than the minimum eigenvalue of ε(z); more generally this tends to make the statistical
uncertainties when one works with y greater than the statistical uncertainties when one works with z. (However
this naive interpretation is perhaps somewhat misleading: It might be more appropriate to say that the statistical
uncertainties when one works with z are anomalously low due to the fact that one has artificially stretched out the
domain of the data.)
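
The comparison between the two curvature matrices can be checked numerically. The sketch below is an illustration only: the redshifts and uncertainties are hypothetical, `curvature_element` is a name introduced here, and the lower bound tested is the elementary one that follows from y = z/(1+z), namely that each term is reduced by at most a factor (1+z_max)^(j+k).

```python
def curvature_element(x, sigma, j, k):
    """alpha_jk = sum_I x_I**(j + k) / sigma_I**2."""
    return sum(xi ** (j + k) / si**2 for xi, si in zip(x, sigma))


z = [0.1, 0.3, 0.5, 0.8, 1.0, 1.5]   # hypothetical z-redshifts
sigma = [0.2] * len(z)                # same uncertainties for both variables
y = [zi / (1.0 + zi) for zi in z]     # y = z / (1 + z)
z_max = max(z)

checks = []
for j in range(3):
    for k in range(3):
        a_z = curvature_element(z, sigma, j, k)
        a_y = curvature_element(y, sigma, j, k)
        lower = a_z / (1.0 + z_max) ** (j + k)
        checks.append(lower <= a_y <= a_z)

trace_z = sum(curvature_element(z, sigma, j, j) for j in range(3))
trace_y = sum(curvature_element(y, sigma, j, j) for j in range(3))
```

Every entry of `checks` should be true, and `trace_y` should come out smaller than `trace_z`, which is the numerical counterpart of the statement that the y-redshift compresses the domain of the data.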
E. Estimates of the deceleration and jerk

For all five of the cosmological distance scales discussed in this article, we have calculated the coefficients b_j for the
logarithmic distance fits, and their statistical uncertainties, for a polynomial of order n = 2 in both the y-redshift and
z-redshift, for both the legacy05 and gold06 datasets. The constant term b_0 is (as usual in this context) a "nuisance
term" that depends on an overall luminosity calibration that is not relevant to the questions at hand. These coefficients
are then converted to estimates of the deceleration parameter q_0 and the combination (j_0 + Ω_0) involving the jerk. A
particularly nice feature of the logarithmic distance fits is that logarithmic distances are linearly related to the reported
distance modulus. So assumed Gaussian errors in the distance modulus remain Gaussian when reported in terms of
logarithmic distance, which then evades one potential problem source: whatever is going on in our analysis, it is
not due to the nonlinear transformation of Gaussian errors. We should also mention that for both the legacy05 and
gold06 datasets the uncertainties in z have been folded into the reported values of the distance modulus: The reported
values of redshift (formally) have no uncertainties associated with them, and so the nonlinear transformation y ↔ z
does not (formally) affect the assumed Gaussian distribution of the errors. (Furthermore, since the logarithms of the
various distance scales are all related by adding or subtracting known functions of redshift, this massaging of the data
to formally eliminate uncertainties in redshift has the effect of making the statistical uncertainties in q_0 independent
of the distance scale chosen.)
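
The linear relation between distance modulus and logarithmic distance can be made explicit with a short sketch. It rests on two assumptions that are not restated in this section: the standard definition μ = 5 log10(d_L/Mpc) + 25, and the relation d = d_L (1+z)^(-k/2) with k = 0, 1, 2, 3, 4 for d_L, d_F, d_P, d_Q, d_A. The function names are hypothetical.

```python
import math

LN10_OVER_5 = math.log(10.0) / 5.0

# Assumed exponents k in d = d_L * (1 + z)**(-k / 2)
HALF_POWERS = {"d_L": 0, "d_F": 1, "d_P": 2, "d_Q": 3, "d_A": 4}


def ln_luminosity_distance_mpc(mu):
    """ln(d_L / Mpc) from the distance modulus, assuming mu = 5 log10(d_L/Mpc) + 25."""
    return LN10_OVER_5 * (mu - 25.0)


def ln_distance_sigma(sigma_mu):
    """Uncertainty in ln(d); the same for every distance scale when z is exact."""
    return LN10_OVER_5 * sigma_mu


def normalized_log_distance(mu, z, scale="d_L", use_y=False):
    """ln(d / [x Mpc]) with x = z, or x = y = z / (1 + z) when use_y is True."""
    ln_d = ln_luminosity_distance_mpc(mu) - 0.5 * HALF_POWERS[scale] * math.log1p(z)
    x = z / (1.0 + z) if use_y else z
    return ln_d - math.log(x)


# Hypothetical supernova: mu = 40 at z = 0.5
difference = (normalized_log_distance(40.0, 0.5, "d_L")
              - normalized_log_distance(40.0, 0.5, "d_A"))
```

Because changing the distance scale or the redshift variable only adds a known function of redshift, the uncertainty returned by `ln_distance_sigma` is unaffected by either choice.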

The results are presented in tables II-V. Note that even after we have extracted these numerical results there is still
a considerable amount of interpretation that has to go into understanding their physical implications. In particular
note that the differences between the various models (Which distance do we use? Which version of redshift do we use?
Which dataset do we use?) often dwarf the statistical uncertainties within any particular model. If better quality
(smaller scatter) data were to become available, then one could hope that the cubic term would survive the F-test.
This would have follow-on effects in terms of making the differences between the various estimates of the deceleration
parameter smaller [13], which would give us greater confidence in the reliability and robustness of the conclusions.

TABLE II: Deceleration and jerk parameters (legacy05 dataset, y-redshift).

distance    q_0              j_0 + Ω_0
d_L         -0.47 ± 0.38     -0.48 ± 3.53
d_F         -0.57 ± 0.38     +1.04 ± 3.71
d_P         -0.66 ± 0.38     +2.61 ± 3.88
d_Q         -0.76 ± 0.38     +4.22 ± 4.04
d_A         -0.85 ± 0.38     +5.88 ± 4.20

With 1-σ statistical uncertainties.



TABLE III: Deceleration and jerk parameters (legacy05 dataset, z-redshift).

distance    q_0              j_0 + Ω_0
d_L         -0.48 ± 0.17     +0.43 ± 0.60
d_F         -0.56 ± 0.17     +1.16 ± 0.65
d_P         -0.62 ± 0.17     +1.92 ± 0.69
d_Q         -0.69 ± 0.17     +2.69 ± 0.74
d_A         -0.75 ± 0.17     +3.49 ± 0.79

With 1-σ statistical uncertainties.



The statistical uncertainties in q_0 are independent of the distance scale used because they are linearly related to
the statistical uncertainties in the parameter b_1, which themselves depend only on the curvature matrix, which is
independent of the distance scale used. In contrast, the statistical uncertainties in (j_0 + Ω_0), while they depend
linearly on the statistical uncertainties in the parameter b_2, depend nonlinearly on q_0 and its statistical uncertainty.
TABLE IV: Deceleration and jerk parameters (gold06 dataset, y-redshift).

distance    q_0              j_0 + Ω_0
d_L         -0.62 ± 0.29     +1.66 ± 2.60
d_F         -0.78 ± 0.29     +3.95 ± 2.80
d_P         -0.94 ± 0.29     +6.35 ± 3.00
d_Q         -1.09 ± 0.29     +8.87 ± 3.20
d_A         -1.25 ± 0.29     +11.5 ± 3.41

With 1-σ statistical uncertainties.



TABLE V: Deceleration and jerk parameters (gold06 dataset, z-redshift).

distance    q_0              j_0 + Ω_0
d_L         -0.37 ± 0.11     +0.26 ± 0.20
d_F         -0.48 ± 0.11     +1.10 ± 0.24
d_P         -0.58 ± 0.11     +1.98 ± 0.29
d_Q         -0.68 ± 0.11     +2.92 ± 0.37
d_A         -0.79 ± 0.11     +3.90 ± 0.39

With 1-σ statistical uncertainties.



VII. MODEL-BUILDING UNCERTAINTIES 



The fact that there are such large differences between the cosmological parameters deduced from the different
models should give one pause for concern. These differences do not arise from any statistical flaw in the analysis,
nor do they in any sense represent any "systematic" error; rather, they are an intrinsic side-effect of what it means
to do a least-squares fit (to a finite-polynomial approximate Taylor series) in a situation where it is physically
unclear as to which if any particular measure of "distance" is physically preferable, and which particular notion of
"distance" should be fed into the least-squares algorithm. (This "feature" of the least-squares algorithm, which some
may call a "limitation", in the absence of a clear physically motivated dynamical model is an often overlooked
confounding factor in data analysis [35, 36, 37, 38, 39, 40].) In appendix A we present a brief discussion of the most
salient mathematical issues.

The key numerical observations are that the different notions of cosmological distance lead to equally spaced
least-squares estimates of the deceleration parameter, with equal statistical uncertainties; the reason for the equal-spacing
of these estimates being analytically explainable by the argument presented in appendix A. Furthermore, from the
results in appendix A we can explicitly calculate the magnitude of this modelling ambiguity as



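\[
  [\Delta q_0]_{\mathrm{modelling}} = -1 + \left[ \left( \sum_I z_I^{\,i+j} \right)^{-1} \left( \sum_I z_I^{\,j} \ln(1+z_I) \right) \right]_1
  \tag{53}
\]

while the corresponding formula for y-redshift is

\[
  [\Delta q_0]_{\mathrm{modelling}} = -1 - \left[ \left( \sum_I y_I^{\,i+j} \right)^{-1} \left( \sum_I y_I^{\,j} \ln(1-y_I) \right) \right]_1.
  \tag{54}
\]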



Note that for the quadratic fits we have adopted this requires calculating an (n+1) × (n+1) matrix, with {i, j} ∈ {0, 1, 2},
inverting it, and then taking the inner product between the first row of this inverse matrix and the relevant column
vector. The Einstein summation convention is implied on the j index. For the z-redshift (if we were to restrict our
z-redshift dataset to z < 1, e.g., using legacy05 or a truncation of gold06) it makes sense to Taylor series expand the
logarithm to alternatively yield



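\[
  [\Delta q_0]_{\mathrm{modelling}} = \sum_{k=n+1}^{\infty} \frac{(-1)^{k+1}}{k} \left[ \left( \sum_I z_I^{\,i+j} \right)^{-1} \left( \sum_I z_I^{\,j+k} \right) \right]_1.
  \tag{55}
\]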



For the y-redshift we do not need this restriction and can simply write 



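\[
  [\Delta q_0]_{\mathrm{modelling}} = \sum_{k=n+1}^{\infty} \frac{1}{k} \left[ \left( \sum_I y_I^{\,i+j} \right)^{-1} \left( \sum_I y_I^{\,j+k} \right) \right]_1.
  \tag{56}
\]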



As an extra consistency check we have independently calculated these quantities (which depend only on the redshifts
of the supernovae) and compared them with the spacing we find by comparing the various least-squares analyses.
For the n = 2 quadratic fits these formulae reproduce the spacing reported in tables II-V. As the order n of the
polynomial increases, it was seen that the differences between deceleration parameter estimates based on the different
distance measures decrease. Unfortunately the size of the purely statistical uncertainties was simultaneously seen
to increase, this being a side effect of adding terms that are not statistically significant according to the F-test.
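
The consistency check described above can be imitated with a short sketch. It is an illustration under stated assumptions, not a reproduction of the appendix A calculation: it assumes that successive distance scales differ by a factor (1+z)^(1/2), that the fits are unweighted, and that the shift in the q_0 estimate is therefore fixed by the linear coefficient of a least-squares polynomial fit to ln(1+z), or to ln(1−y) in the y-redshift case. The function names and the sample redshifts are hypothetical.

```python
import math


def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(map(float, A[r])) + [float(b[r])] for r in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x


def linear_coefficient(x, f_values, order):
    """Coefficient of x**1 in the unweighted least-squares polynomial fit to f."""
    m = order + 1
    A = [[sum(xi ** (i + j) for xi in x) for j in range(m)] for i in range(m)]
    rhs = [sum(xi ** i * fi for xi, fi in zip(x, f_values)) for i in range(m)]
    return solve_linear(A, rhs)[1]


def modelling_shift_z(z, order=2):
    """Shift in the q_0 estimate between adjacent distance scales, z-redshift."""
    return -1.0 + linear_coefficient(z, [math.log1p(zi) for zi in z], order)


def modelling_shift_y(y, order=2):
    """Same shift when the fit is carried out in y = z / (1 + z)."""
    return -1.0 - linear_coefficient(y, [math.log(1.0 - yi) for yi in y], order)


z_sample = [0.1 * i for i in range(1, 11)]        # hypothetical redshifts, z <= 1
y_sample = [zi / (1.0 + zi) for zi in z_sample]
shift_z = modelling_shift_z(z_sample)
shift_y = modelling_shift_y(y_sample)
```

If the logarithm were exactly representable by the fitted polynomial, both shifts would vanish; the non-zero values come entirely from the truncated higher-order terms, which is why the spacing shrinks as the polynomial order increases.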

Thus to minimize "model building ambiguities" one wishes the parameter "n" to be as large as possible, while to 
minimize statistical uncertainties, one does not want to add statistically meaningless terms to the polynomial. 

Note that if one were to have a clearly preferred physically motivated "best" distance this whole model building
ambiguity goes away. In the absence of a clear physically justifiable preference, the best one can do is to combine
the data as per the discussion in appendix B, which is based on NIST recommended guidelines [41], and report an
additional model building uncertainty (beyond the traditional purely statistical uncertainty).

Note that we do limit the modelling uncertainty to that due to considering the five reasonably standard definitions
of distance d_A, d_Q, d_P, d_F, and d_L. The reasons for this limitation are partially practical (we have to stop somewhere),
and partly physics-related (these five definitions of distance have reasonably clear physical interpretations, and there
seems to be no good physics reason for constructing yet more notions of cosmological distance).

Turning to the quantity (j_0 + Ω_0), the different notions of distance no longer yield equally spaced estimates, nor
are the statistical uncertainties equal. This is due to the fact that there is a nonlinear quadratic term involving q_0
present in the relation used to convert the polynomial coefficient b_2 into the more physical parameter (j_0 + Ω_0). Note
that while for each specific model (choice of distance scale and redshift variable) the F-test indicates that keeping
the quadratic term is statistically significant, the variation among the models is so great as to make measurements of
(j_0 + Ω_0) almost meaningless. The combined results are reported in tables VI-VII. Note that these tables do not yet
include any budget for "systematic" uncertainties.



TABLE VI: Deceleration parameter summary: Statistical plus modelling.

dataset     redshift    q_0 ± σ_statistical ± σ_modelling
legacy05    y           -0.66 ± 0.38 ± 0.13
legacy05    z           -0.62 ± 0.17 ± 0.10
gold06      y           -0.94 ± 0.29 ± 0.22
gold06      z           -0.58 ± 0.11 ± 0.15

With 1-σ statistical uncertainties and 1-σ model building uncertainties,
no budget for "systematic" uncertainties.
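
As a cross-check on table VI, the sketch below takes the five q_0 estimates from tables II-V for each dataset and redshift variable and computes their mean and population standard deviation. The combination procedure of appendix B is not reproduced in this section, so this is only an observation that the tabulated central values and modelling uncertainties agree with these two statistics to within rounding of the inputs; it should not be read as a statement of the procedure actually used.

```python
import statistics

# q_0 estimates for d_L, d_F, d_P, d_Q, d_A, copied from tables II-V
q0_estimates = {
    ("legacy05", "y"): [-0.47, -0.57, -0.66, -0.76, -0.85],
    ("legacy05", "z"): [-0.48, -0.56, -0.62, -0.69, -0.75],
    ("gold06", "y"): [-0.62, -0.78, -0.94, -1.09, -1.25],
    ("gold06", "z"): [-0.37, -0.48, -0.58, -0.68, -0.79],
}

# (central value, modelling uncertainty) as listed in table VI
tabulated = {
    ("legacy05", "y"): (-0.66, 0.13),
    ("legacy05", "z"): (-0.62, 0.10),
    ("gold06", "y"): (-0.94, 0.22),
    ("gold06", "z"): (-0.58, 0.15),
}

summary = {key: (statistics.mean(values), statistics.pstdev(values))
           for key, values in q0_estimates.items()}

for key, (mean, spread) in summary.items():
    print(key, f"{mean:+.3f}", f"{spread:.3f}", "table VI:", tabulated[key])
```

The comparison is limited by the two-decimal rounding of the inputs, so agreement at the level of about 0.01 is the most that can be expected.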



Again, we reiterate the fact that there are distressingly large differences between the cosmological parameters
deduced from the different models; this should give one pause for concern above and beyond the purely formal
statistical uncertainties reported herein.



VIII. SYSTEMATIC UNCERTAINTIES 



Beyond the statistical uncertainties and model-building uncertainties we have so far considered lies the issue of 
systematic uncertainties. Systematic uncertainties are extremely difficult to quantify in cosmology, at least when it 
comes to distance measurements — see for instance the relevant discussion in [J, [f| , or in [|| . What is less difficult to 
quantify, but still somewhat tricky, is the extent to which systematics propagate through the calculation. 
TABLE VII: Jerk parameter summary: Statistical plus modelling.

dataset     redshift    (j_0 + Ω_0) ± σ_statistical ± σ_modelling
legacy05    y           +2.65 ± 3.88 ± 2.25
legacy05    z           +1.94 ± 0.70 ± 1.08
gold06      y           +6.47 ± 3.02 ± 3.48
gold06      z           +2.03 ± 0.31 ± 1.29

With 1-σ statistical uncertainties and 1-σ model building uncertainties,
no budget for "systematic" uncertainties.



A. Major philosophies underlying the analysis of statistical uncertainty 

When it comes to dealing with systematic uncertainties there are two major philosophies on how to report and 
analyze them: 

• Treat all systematic uncertainties as though they were purely statistical and report 1-sigma "effective standard
uncertainties". In propagating systematic uncertainties treat them as though they were purely statistical and
uncorrelated with the usual statistical uncertainties. In particular, this implies that one is to add estimated
systematic and statistical uncertainties in quadrature

\[
  \sigma_{\mathrm{combined}} = \sqrt{\sigma_{\mathrm{statistical}}^2 + \sigma_{\mathrm{systematic}}^2}.
  \tag{57}
\]

This manner of treating the systematic uncertainties is that currently recommended by NIST [41], this
recommendation itself being based on ISO, CIPM, and BIPM recommendations. This is also the language
most widely used within the supernova community, and in particular in discussing the gold06 and legacy05
datasets [H, HI H, EL HI j so we shall standardize our language to follow these norms. 

• An alternative manner of dealing with systematics (now deprecated) is to carefully segregate systematic and
statistical effects, somehow estimate "credible bounds" on the systematic uncertainties, and then propagate the
systematics through the calculation, if necessary using interval arithmetic to place "credible bounds" on the
final reported systematic uncertainty. The measurement results would then be reported as a number with two
independent sources of uncertainty (the statistical and systematic uncertainties), and within this philosophy
there is no justification for adding statistical and systematic effects in quadrature.
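
Under the first philosophy, combining uncertainties is a one-line computation. The sketch below is illustrative only: `combine_in_quadrature` is a name introduced here, and the two input values are hypothetical round numbers, not entries from any table.

```python
import math


def combine_in_quadrature(*sigmas):
    """Root-sum-square of uncertainties that are treated as uncorrelated."""
    return math.sqrt(sum(s * s for s in sigmas))


# Hypothetical statistical and systematic 1-sigma uncertainties
combined = combine_in_quadrature(0.3, 0.4)
```

The function accepts any number of components, so further contributions can be folded in the same way, provided one is willing to treat them all as uncorrelated effective standard uncertainties.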

It is important to realise that the systematic uncertainties reported in gold06 and legacy05 are of the first type:
effective equivalent 1-sigma error bars 0, [H, H, EL [1[ ■ These reported uncertainties are based on what in the supernova 
community are referred to as "known unknowns" . 

(The NIST guidelines [41] also recommend that all uncertainties estimated by statistical methods should be denoted
by the symbol s, not σ, and that uncertainties estimated by non-statistical methods, and combined overall
uncertainties, should be denoted by the symbol u; but this is rarely done in practice, and we shall follow the traditional
abuse of notation and continue to use σ throughout.)



B. Deceleration 



For instance, assume we can measure distance moduli to within a systematic uncertainty Δμ_systematic over a redshift
range Δ(redshift). If all the measurements are biased high, or all are biased low, then the systematic uncertainty
would affect the Hubble parameter H_0, but would not in any way disturb the deceleration parameter q_0. However
there may be a systematic drift in the bias as one scans across the range of observed redshifts. The worst that could
plausibly happen is that all measurements are systematically biased high at one end of the range, and biased low at
the other end of the range. For data collected over a finite width Δ(redshift), this "worst plausible" situation leads
to a systematic uncertainty in the slope of

\[
  \Delta \left[ \frac{d\mu}{dz} \right]_{\mathrm{systematic}} = \frac{2 \, \Delta\mu_{\mathrm{systematic}}}{\Delta(\mathrm{redshift})},
  \tag{58}
\]
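which then propagates to an uncertainty in the deceleration parameter of

\[
  \sigma_{\mathrm{systematic}} = \frac{2 \ln 10}{5} \, \Delta \left[ \frac{d\mu}{dz} \right]_{\mathrm{systematic}} = \frac{4 \ln 10}{5} \, \frac{\Delta\mu_{\mathrm{systematic}}}{\Delta(\mathrm{redshift})} \approx 1.8 \, \frac{\Delta\mu_{\mathrm{systematic}}}{\Delta(\mathrm{redshift})}.
  \tag{59}
\]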



For the situation we are interested in, we take at face value the reliability of the assertion "...we adopt a limit on
redshift-dependent systematics to be 5% per Δz = 1", interpreting it as meaning up to 2.5% high at one end of the
range and up to 2.5% low at the other end of the range. A 2.5% variation in distance then corresponds, via
Δμ_D = 5 Δ(ln d_L)/ln 10, to an uncertainty Δμ_systematic = 0.05 in stellar magnitude. So (taking Δz = 1) one has to
face the somewhat sobering estimate that the "equivalent 1-σ uncertainty" for the deceleration parameter q_0 is

\[
  \sigma_{\mathrm{systematic}} = 0.09.
  \tag{60}
\]



When working with y-redshift, one really should reanalyze the entire corpus of data from first principles. Failing
that (not enough of the raw data is publicly available), we shall simply observe that

\[
  \frac{dz}{dy} \to 1 \quad \text{as} \quad y \to 0,
  \tag{61}
\]

and use this as a justification for assuming that the systematic uncertainty in q_0 when using y-redshift is the same as
when using z-redshift.

C. Jerk 

Turning to systematic uncertainties in the jerk, the worst that could plausibly happen is that all measurements are 
systematically biased high at both ends of the range, and biased low at the middle, (or low at both ends and high in 
the middle), leading to a systematic uncertainty in the second derivative of 



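\[
  \frac{1}{2} \, \Delta \left[ \frac{d^2\mu}{dz^2} \right]_{\mathrm{systematic}} \left( \frac{\Delta(\mathrm{redshift})}{2} \right)^2 = 2 \, \Delta\mu_{\mathrm{systematic}},
  \tag{62}
\]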



where we have taken the second-order term in the Taylor expansion around the midpoint of the redshift range, and
asked that it saturate the estimated systematic error 2Δμ_systematic. This implies

\[
  \Delta \left[ \frac{d^2\mu}{dz^2} \right]_{\mathrm{systematic}} = \frac{16 \, \Delta\mu_{\mathrm{systematic}}}{\Delta(\mathrm{redshift})^2},
  \tag{63}
\]

which then propagates to an uncertainty in the jerk parameter (j_0 + Ω_0) of at least

\[
  \sigma_{\mathrm{systematic}} \geq \frac{3 \ln 10}{5} \, \Delta \left[ \frac{d^2\mu}{dz^2} \right]_{\mathrm{systematic}} = \frac{48 \ln 10}{5} \, \frac{\Delta\mu_{\mathrm{systematic}}}{\Delta(\mathrm{redshift})^2} \approx 22 \, \frac{\Delta\mu_{\mathrm{systematic}}}{\Delta(\mathrm{redshift})^2}.
  \tag{64}
\]



There are additional contributions to the systematic uncertainty arising from terms linear and quadratic in q_0. They do
not seem to be important in the situations we are interested in so we content ourselves with the single term estimated
above. Using Δμ_systematic = 0.05 and Δz = 1 we see that the "equivalent 1-σ uncertainty" for the combination
(j_0 + Ω_0) is:

\[
  \sigma_{\mathrm{systematic}} = 1.11.
  \tag{65}
\]
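
The arithmetic behind the two systematic estimates can be laid out explicitly. The sketch below assumes the prefactors 4 ln 10/5 for the deceleration parameter and 48 ln 10/5 for the jerk combination, which follow from a slope uncertainty of 2Δμ/Δ(redshift) and a second-derivative uncertainty of 16Δμ/Δ(redshift)²; the function names are hypothetical.

```python
import math

LN10 = math.log(10.0)


def magnitude_shift(fractional_distance_shift):
    """Delta(mu) = 5 * Delta(ln d_L) / ln 10 for a small fractional distance shift."""
    return 5.0 * fractional_distance_shift / LN10


def sigma_q0_systematic(delta_mu, delta_redshift):
    """Deceleration parameter: (4 ln 10 / 5) * delta_mu / delta_redshift."""
    return (4.0 * LN10 / 5.0) * delta_mu / delta_redshift


def sigma_jerk_systematic(delta_mu, delta_redshift):
    """Jerk combination: (48 ln 10 / 5) * delta_mu / delta_redshift**2."""
    return (48.0 * LN10 / 5.0) * delta_mu / delta_redshift**2


# 2.5% in distance, rounded to two decimals in magnitude as in the text
delta_mu = round(magnitude_shift(0.025), 2)
sigma_q0 = sigma_q0_systematic(delta_mu, 1.0)
sigma_jerk = sigma_jerk_systematic(delta_mu, 1.0)
```

One point worth noting: if Δμ is not rounded to 0.05 before use, the logarithms cancel and the same formulas give exactly 4 × 0.025 = 0.10 and 48 × 0.025 = 1.20, so the quoted values carry a little under 10% of rounding slack.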



Thus direct cosmographic measurements of the jerk parameter are plagued by very high systematic uncertainties. 
Note that the systematic uncertainties calculated in this section are completely equivalent to those reported in [4].



IX. HISTORICAL ESTIMATES OF SYSTEMATIC UNCERTAINTY 



We now turn to the question of possible additional contributions to the uncertainty, based on what the NIST
recommendations call "type B evaluations of uncertainty", namely "any method of evaluation of uncertainty by
means other than the statistical analysis of a series of observations" [41]. (This includes effects that in the supernova
community are referred to as "unknown unknowns", which are not reported in any of their estimates of systematic
uncertainty.)

The key point here is this: "A type B evaluation of standard uncertainty is usually based on scientific judgment
using all of the relevant information available, which may include: previous measurement data, etc..." [41]. It is this
recommendation that underlies what we might wish to call the "historical" estimates of systematic uncertainty.
Roughly speaking, we suggest that in the systematic uncertainty budget it is prudent to keep an extra "historical
uncertainty" at least as large as the most recent major re-calibration of whatever measurement method you are
currently using.

Now this "historical uncertainty" contribution to the systematic uncertainty budget that we are advocating is 
based on 100 years of unanticipated systematic errors ("unknown unknowns") in astrophysical distance scales — from 
Hubble's reliance on mis-calibrated Cepheid variables (leading to distance estimates that were about 666% too large),
to last decade's debates on the size of our own galaxy (with up to 15% disagreements being common), to last year's
5% shift in the high-z supernova distances [1, [B| — and various other re-calibration events in between. That is, 5% 
variations in estimates of cosmological distances on a 2 year time scale seem common, 10% on a 10 year time scale, 
and 500% or more on an 80 year timescale. 

(These re-calibrations are of course not all related to supernova measurements, but they are historical evidence of
how difficult it is to make reliable distance measurements in cosmology.) Based on the historical evidence we feel that
it is currently prudent to budget an additional "historical uncertainty" of approximately 5% in the distances to the 
furthest supernovae, (corresponding to 0.10 stellar magnitudes), while for the nearby supernovae we generously budget 
a "historical uncertainty" of 0%, based on the fact that these distances have not changed in the last 2 years [HQ. 

(Some researchers have argued that the present "historical" estimates of uncertainty confuse the notion of "error"
with that of "uncertainty". We disagree. What we are doing here is to use the most recently detected (significant)
error to estimate one component of the uncertainty; this is simply a "scientific judgment using all of the relevant
information available" [41].

By using the most recent major re-calibration as our basis for historical uncertainty we feel we are steering a middle
course between placing too much versus too little credence in the observational data.

We should add that other researchers have suggested that we are being too generous above, and that our estimates
of "historical uncertainty" should be even larger.)



A. Deceleration



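This implies

\[
  \Delta \left[ \frac{d\mu}{dz} \right]_{\mathrm{historical}} = \frac{\Delta\mu_{\mathrm{historical}}}{\Delta(\mathrm{redshift})}.
  \tag{66}
\]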



Note the absence of a factor 2 compared to equation (58); this is because in this "historical" discussion we have taken
the nearby supernovae to be accurately calibrated, whereas in the discussion of systematic uncertainties in equation
(58) both nearby and distant supernovae are subject to "known unknown" systematics. This then propagates to an
uncertainty in the deceleration parameter of



\[
  \sigma_{\mathrm{historical}} = \frac{2 \ln 10}{5} \, \Delta \left[ \frac{d\mu}{dz} \right]_{\mathrm{historical}} = \frac{2 \ln 10}{5} \, \frac{\Delta\mu_{\mathrm{historical}}}{\Delta(\mathrm{redshift})} \approx 0.9 \, \frac{\Delta\mu_{\mathrm{historical}}}{\Delta(\mathrm{redshift})}.
  \tag{67}
\]

Noting that a 5% shift in luminosity distance is equivalent to an uncertainty of Δμ_historical = 0.10 in stellar magnitude,
this implies that the "equivalent 1-σ uncertainty" for the deceleration parameter q_0 is

\[
  \sigma_{\mathrm{historical}} = 0.09.
  \tag{68}
\]

This (coincidentally) is equal to the systematic uncertainties based on "known unknowns".



B. Jerk 



Turning to the second derivative a similar analysis implies

\[
  \frac{1}{2} \, \Delta \left[ \frac{d^2\mu}{dz^2} \right]_{\mathrm{historical}} \Delta(\mathrm{redshift})^2 = \Delta\mu_{\mathrm{historical}}.
  \tag{69}
\]
Note the absence of various factors of 2 as compared to equation (62). This is because we are now assuming that
for "historical" purposes the nearby supernovae are accurately calibrated and it is only the distant supernovae that
are potentially uncertain; thus in estimating the historical uncertainty the second-order term in the Taylor series is
now to be saturated using the entire redshift range. Thus



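\[
  \Delta \left[ \frac{d^2\mu}{dz^2} \right]_{\mathrm{historical}} = \frac{2 \, \Delta\mu_{\mathrm{historical}}}{\Delta(\mathrm{redshift})^2},
  \tag{70}
\]

which then propagates to an uncertainty in the jerk parameter of at least

\[
  \sigma_{\mathrm{historical}} \geq \frac{3 \ln 10}{5} \, \Delta \left[ \frac{d^2\mu}{dz^2} \right]_{\mathrm{historical}} = \frac{6 \ln 10}{5} \, \frac{\Delta\mu_{\mathrm{historical}}}{\Delta(\mathrm{redshift})^2} \approx 2.75 \, \frac{\Delta\mu_{\mathrm{historical}}}{\Delta(\mathrm{redshift})^2}.
  \tag{71}
\]

Again taking Δμ_historical = 0.10 this implies that the "equivalent 1-σ uncertainty" for the combination j_0 + Ω_0 is

\[
  \sigma_{\mathrm{historical}} = 0.28.
  \tag{72}
\]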



Note that this is (coincidentally) one quarter the size of the systematic uncertainties based on "known unknowns" , 
and is still quite sizable. 
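
The arithmetic behind the two historical estimates can be checked with a short sketch. It assumes the conversion factor ln(10)/5 between stellar magnitudes and natural-log distance, a redshift range Δ(redshift) of order unity (implied by the quoted numbers rather than stated explicitly here), and, for the jerk, a prefactor 6 ln(10)/5 ≈ 2.76, which is our reading of the coefficient quoted as 2.75. The function names are illustrative only.

```python
import math

# Converts an uncertainty in stellar magnitudes into natural-log distance units.
LN10_OVER_5 = math.log(10) / 5


def historical_sigma_q0(delta_mu, delta_z):
    """Historical uncertainty in q0: (2 ln10 / 5) * delta_mu / delta_z."""
    return 2 * LN10_OVER_5 * delta_mu / delta_z


def historical_sigma_jerk(delta_mu, delta_z):
    """Historical uncertainty in (j0 + Omega0): (6 ln10 / 5) * delta_mu / delta_z**2.

    The factor 6 ln10 / 5 is an assumption matching the quoted coefficient 2.75.
    """
    return 6 * LN10_OVER_5 * delta_mu / delta_z**2


# A 5% shift in luminosity distance, expressed in stellar magnitudes
# (about 0.106, which the text rounds to 0.10).
delta_mu_5pct = 5 * math.log10(1.05)

# Assumption: the redshift range is taken to be 1.
sigma_q0 = historical_sigma_q0(0.10, 1.0)
sigma_jerk = historical_sigma_jerk(0.10, 1.0)

print(f"5% distance shift = {delta_mu_5pct:.3f} mag")
print(f"sigma_historical(q0) = {sigma_q0:.2f}")
print(f"sigma_historical(j0 + Omega0) = {sigma_jerk:.2f}")
```

With these inputs the two results round to the values 0.09 and 0.28 quoted above.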

The systematic and historical uncertainties are now reported in tables VIII–IX. The estimates for systematic
uncertainties are equivalent to those presented in Q , which is largely in accord with related sources 0, 0, S] ■ Our 
estimate for "historical" uncertainties is likely to be more controversial, with several cosmologists arguing that our
estimates are too generous, and that $\sigma_{\rm historical}$ should perhaps be even larger than we have estimated. What is not
(or should not be) controversial is the need for some estimate of $\sigma_{\rm historical}$. Previous history should not be ignored, and
as the NIST guidelines emphasize, previous history is an essential and integral part of making the scientific judgment 
as to what the overall uncertainties are. 

TABLE VIII: Deceleration parameter summary: Statistical, modelling, systematic, and historical. 



dataset    redshift   $q_0 \pm \sigma_{\rm statistical} \pm \sigma_{\rm modelling} \pm \sigma_{\rm systematic} \pm \sigma_{\rm historical}$
legacy05   y          -0.66 ± 0.38 ± 0.13 ± 0.09 ± 0.09
legacy05   z          -0.62 ± 0.17 ± 0.10 ± 0.09 ± 0.09
gold06     y          -0.94 ± 0.29 ± 0.22 ± 0.09 ± 0.09
gold06     z          -0.58 ± 0.11 ± 0.15 ± 0.09 ± 0.09

With 1-σ effective statistical uncertainties for all components.



TABLE IX: Jerk parameter summary: Statistical, modelling, systematic, and historical. 



dataset    redshift   $(j_0 + \Omega_0) \pm \sigma_{\rm statistical} \pm \sigma_{\rm modelling} \pm \sigma_{\rm systematic} \pm \sigma_{\rm historical}$
legacy05   y          +2.65 ± 3.88 ± 2.25 ± 1.11 ± 0.28
legacy05   z          +1.94 ± 0.70 ± 1.08 ± 1.11 ± 0.28
gold06     y          +6.47 ± 3.02 ± 3.48 ± 1.11 ± 0.28
gold06     z          +2.03 ± 0.31 ± 1.29 ± 1.11 ± 0.28

With 1-σ effective statistical uncertainties for all components.



X. COMBINED UNCERTAINTIES 



We now combine these various uncertainties, purely statistical, modelling, "known unknown" systematics, and 
"historical" ("unknown unknowns"). Adopting the NIST philosophy of dealing with systematics, these uncertainties 
are to be added in quadrature [41]. Including all 4 sources of uncertainty we have discussed:



$$\sigma_{\rm combined} = \sqrt{\sigma_{\rm statistical}^2 + \sigma_{\rm modelling}^2 + \sigma_{\rm systematic}^2 + \sigma_{\rm historical}^2}. \tag{73}$$
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That the statistical and modelling uncertainties should be added in quadrature is clear from their definition. Whether 
or not systematic and historical uncertainties should be treated this way is very far from clear, and implicitly presup- 
poses that there are no correlations between the systematics and the statistical uncertainties — within the "credible 
bounds" philosophy for estimating systematic uncertainties there is no justification for such a step. Within the "all 
errors are effectively statistical" philosophy adding in quadrature is standard and in fact recommended — this is what 
is done in current supernova analyses, and we shall continue to do so here. The combined uncertainties $\sigma_{\rm combined}$ are
reported in tables X–XI.
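
As a quick consistency check, the quadrature sum can be applied to the four deceleration-parameter rows of Table VIII. The sketch below is illustrative only: the dictionary layout and the helper name `combine_in_quadrature` are our own choices, and the input numbers are simply copied from that table.

```python
import math


def combine_in_quadrature(*sigmas):
    """Root-sum-of-squares of the supplied 1-sigma uncertainties."""
    return math.sqrt(sum(s * s for s in sigmas))


# (dataset, redshift variable): (q0, statistical, modelling, systematic, historical)
# Values copied from Table VIII.
table_viii = {
    ("legacy05", "y"): (-0.66, 0.38, 0.13, 0.09, 0.09),
    ("legacy05", "z"): (-0.62, 0.17, 0.10, 0.09, 0.09),
    ("gold06", "y"): (-0.94, 0.29, 0.22, 0.09, 0.09),
    ("gold06", "z"): (-0.58, 0.11, 0.15, 0.09, 0.09),
}

# Skip the first entry (the best-fit q0) and combine the four uncertainties.
combined_q0 = {
    key: combine_in_quadrature(*row[1:]) for key, row in table_viii.items()
}

for key, sigma in combined_q0.items():
    print(key, f"sigma_combined = {sigma:.2f}")
```

The same helper applies unchanged to the jerk rows of Table IX.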

XI. EXPANDED UNCERTAINTY 

An important concept under the NIST guidelines is that of "expanded uncertainty" 

$$U_k = k\,\sigma_{\rm combined}. \tag{74}$$

Expanded uncertainty is used when for either scientific or legal/regulatory reasons one wishes to be "certain" that
the actual physical value of the quantity being measured lies within the stated range. We shall take k = 3, this being 
equivalent to the well-known particle physics aphorism "if it's not three-sigma, it's not physics". Note that this is 
not an invitation to randomly multiply uncertainties by 3, rather it is a scientific judgment that if one wishes to be 
99.5% certain that something is or is not happening one should look for a 3-sigma effect. Bitter experience within the 
particle physics community has led to the consensus that 3-sigma is the minimum standard one should look for when 
claiming "new physics" . (In fact, there is now a growing consensus in the particle physics community that 5-sigma 
should be the new standard for claiming "new physics" [42].) Thus we take

$$U_3 = 3\,\sigma_{\rm combined}. \tag{75}$$

The best estimates, combined uncertainties $\sigma_{\rm combined}$, and expanded uncertainties $U$, are reported in tables X–XI.

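
To make the three-sigma criterion concrete, the following sketch asks whether zero lies outside the expanded interval $q_0 \pm 3\sigma_{\rm combined}$ for each fit. It uses the rounded best-fit values and combined uncertainties listed in Table X; the helper name `excludes_zero` and the dictionary layout are illustrative assumptions, not part of the original analysis.

```python
# (dataset, redshift variable): (best-fit q0, combined 1-sigma uncertainty)
# Rounded values copied from Table X.
table_x = {
    ("legacy05", "y"): (-0.66, 0.42),
    ("legacy05", "z"): (-0.62, 0.23),
    ("gold06", "y"): (-0.94, 0.39),
    ("gold06", "z"): (-0.58, 0.23),
}


def excludes_zero(value, sigma_combined, k=3.0):
    """True if zero lies outside the interval value +/- k * sigma_combined."""
    return abs(value) > k * sigma_combined


# Distance of each best-fit q0 from zero, in units of sigma_combined.
significance = {key: abs(q0) / sigma for key, (q0, sigma) in table_x.items()}

# Whether each fit would count as a three-sigma detection of q0 != 0.
three_sigma = {key: excludes_zero(q0, sigma) for key, (q0, sigma) in table_x.items()}

for key in table_x:
    print(key, f"{significance[key]:.1f} sigma", "3-sigma:", three_sigma[key])
```

On these inputs none of the four fits excludes $q_0 = 0$ at the three-sigma level once all four sources of uncertainty are combined.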
TABLE X: Deceleration parameter summary: Combined and expanded uncertainties. 



dataset    redshift   $q_0 \pm \sigma_{\rm combined}$   $q_0 \pm U_3$
legacy05   y          -0.66 ± 0.42                      -0.66 ± 1.26
legacy05   z          -0.62 ± 0.23                      -0.62 ± 0.70
gold06     y          -0.94 ± 0.39                      -0.94 ± 1.16
gold06     z          -0.58 ± 0.23                      -0.58 ± 0.68



TABLE XI: Jerk parameter summary: Combined and expanded uncertainties. 



dataset    redshift   $(j_0 + \Omega_0) \pm \sigma_{\rm combined}$   $(j_0 + \Omega_0) \pm U_3$
legacy05   y          +2.65 ± 4.63                                   +2.65 ± 13.9
legacy05   z          +1.94 ± 1.72                                   +1.94 ± 5.17
gold06     y          +6.47 ± 4.75                                   +6.47 ± 14.2
gold06     z          +2.03 ± 1.75                                   +2.03 ± 5.26



XII. RESULTS 

What can we conclude from this? While the "preponderance of evidence" is certainly that the universe is currently 
accelerating, $q_0 < 0$, this is not yet a "gold plated" result. We emphasise the fact that (as is or should be well known)
there is an enormous difference between the two statements: 

• "the most likely value for the deceleration parameter is negative" , 

and 






• "there is significant evidence that the deceleration parameter is negative" . 

When it comes to assessing whether or not the evidence for an accelerating universe is physically significant, the first 
rule of thumb for combined uncertainties is the well known aphorism "if it's not three-sigma, it's not physics". The 
second rule is to be conservative in your systematic uncertainty budget. 

Based on the supernovae alone it is more likely that the expansion of the universe is accelerating, than that the 
expansion of the universe is decelerating — but this is a very long way from having "gold plated" evidence in favour of 
acceleration. The summary table regarding the jerk parameter, or more precisely the combination $(j_0 + \Omega_0)$, is rather
grim reading, and indicates the need for considerable caution in interpreting the supernova data. Note that while 
use of the y-redshift may improve the theoretical convergence properties of the Taylor series, and will not affect the 
uncertainties in the distance modulus or the various distance measures, it does seem to have an unfortunate side-effect 
of magnifying statistical uncertainties for the cosmological parameters. 

As previously mentioned, we have further checked the robustness of our analysis by first excluding the outlier at 
$z = 1.755$, then excluding the so-called "Hubble bubble" at $z < 0.0233$ [32, 33], and then excluding both; the precise
numerical estimates for the cosmological parameters certainly change, but the qualitative picture remains as we have 
painted it here. 

XIII. CONCLUSIONS 

Why do our conclusions seem to be so much at variance with currently perceived wisdom concerning the acceleration 
of the universe? The main reasons are twofold: 

• Instead of simply picking a single model and fitting the data to it, we have tested the overall robustness of the 
scenario by encoding the same physics ($H_0$, $q_0$, $j_0$) in multiple different ways ($d_L$, $d_F$, $d_P$, $d_Q$, $d_A$, using both z
and y) to test the robustness of the data fitting procedures. 

• We have been much more explicit, and conservative, about the role of systematic uncertainties, and their effects 
on estimates of the cosmological parameters. 

If we only use the statistical uncertainties and the "known unknowns" added in quadrature, then the case for cos- 
mological acceleration is much improved, and is (in some cases we study) "statistically significant at three-sigma" , 
but this does not mean that such a conclusion is either robust or reliable. (By "cherry picking" the data, and the 
particular way one analyzes the data, one can find statistical support for almost any conclusion one wants.) 

The modelling uncertainties we have encountered depend on the distance variable one chooses to do the least squares 
fit ($d_L$, $d_F$, $d_P$, $d_Q$, $d_A$). There is no good physics reason for preferring any one of these distance variables over the
others. One can always minimize the modelling uncertainties by going to a higher-order polynomial, unfortunately
at the price of unacceptably increasing the statistical uncertainties, and we have checked that this makes the overall
situation worse. This does however suggest that things might improve if the data had smaller scatter and smaller 
statistical uncertainties: We could then hope that the F-test would allow us to go to a cubic polynomial, in which 
case the dependence on which notion of distance we use for least-squares fitting should decrease. 

We wish to emphasize the point that, regardless of one's views on how to combine formal estimates of 
uncertainty, the very fact that different distance scales yield data-fits with such widely discrepant values 
strongly suggests the need for extreme caution in interpreting the supernova data. 

Though we have chosen to work on a cosmographic framework, and so minimize the number of physics assumptions
that go into the model, we expect that similar modelling uncertainties will also plague other more traditional
approaches. (For instance, in the present-day consensus scenario there is considerable debate as to just when the
universe switches from deceleration to acceleration, with different models making different statistical predictions [43].)
One lesson to take from the current analysis is that purely statistical estimates of error, while they can be used to 
make statistical deductions within the context of a specific model, are often a bad guide as to the extent to which 
two different models for the same physics will yield differing estimates for the same physical quantity. 

There are a number of other more sophisticated statistical methods that might be applied to the data to possibly 
improve the statistical situation. For instance, ridge regression, robust regression, and the use of orthogonal
polynomials and loess curves. However one should always keep in mind the difference between accuracy and precision [34].
More sophisticated statistical analyses may permit one to improve the precision of the analysis, but unless one can 
further constrain the systematic uncertainties such precise results will be no more accurate than the current situation. 
Excessive refinement in the statistical analysis, in the absence of improved bounds on the systematic uncertainties, is 
counterproductive and grossly misleading. 
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However, we are certainly not claiming that all is grim on the cosmological front — and do not wish our views to 
be misinterpreted in this regard — there are clearly parts of cosmology where there is plenty of high-quality data, 
and more coming in, constraining and helping refine our models. But regarding some specific cosmological questions 
the catch cry should still be "Precision cosmology? Not just yet" [44].

In particular, in order for the current technique to become a tool for precision cosmology, we would need more 
data, smaller scatter in the data, and smaller uncertainties. For instance, by performing the F-test we found that it
was almost always statistically meaningless to go beyond quadratic fits to the data. If one can obtain an improved 
dataset of sufficient quality for cubic fits to be meaningful, then ambiguities in the deceleration parameter are greatly 
suppressed. 

In closing, we strongly encourage readers to carefully contemplate figures HHU as an inoculation against over- 
interpretation of the supernova data. In those figures we have split off the linear part of the Hubble law (which is 
encoded in the intercept) and chosen distance variables so that the slope (at redshift zero) of whatever curve one 
fits to those plots is directly proportional to the acceleration of the universe (in fact the slope is equal to $-q_0/2$).
Remember that these plots only exhibit the statistical uncertainties. Remembering that we prefer to work with
natural logarithms, not stellar magnitudes, one should add systematic uncertainties of $\pm[\ln(10)/5] \times (0.05) \approx 0.023$ to
these statistical error bars, presumably in quadrature. Furthermore a good case can be made for adding an additional 
"historical" uncertainty, using the past history of the field to estimate the "unknown unknowns" . 

Ultimately however, it is the fact that these figures do not exhibit any overwhelmingly obvious trend that
makes it so difficult to make a robust and reliable estimate of the sign of the deceleration parameter. 



APPENDIX A: SOME AMBIGUITIES IN LEAST-SQUARES FITTING 

Let us suppose we have a function $f(x)$, and want to estimate $f(x)$ and its derivatives at zero via least squares.
For any $g(x)$ we have a mathematical identity



$$f(x) = [f(x) - g(x)] + g(x), \tag{A1}$$

and for the derivatives

$$f^{(m)}(0) = [f-g]^{(m)}(0) + g^{(m)}(0). \tag{A2}$$



Adding and subtracting the same function $g(x)$ makes no difference to the underlying function $f(x)$, but it may
modify the least squares estimate for that function. That is: Adding and subtracting a known function to the data
does not commute with the process of performing a finite-polynomial least-squares fit. Indeed, let us approximate

$$[f(x) - g(x)] = \sum_{i=0}^{n} b_{f-g,i}\, x^i + \epsilon. \tag{A3}$$



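Then given a set of observations at points $(f_I, x_I)$ we have (in the usual manner) the equations (for simplicity of the
presentation all statistical uncertainties $\sigma$ are set equal for now)

$$[f_I - g(x_I)] = \sum_{i=0}^{n} \hat b_{f-g,i}\, x_I^i + \epsilon_I, \tag{A4}$$

where we want to minimize

$$\sum_I |\epsilon_I|^2. \tag{A5}$$

This leads to

$$\sum_I [f_I - g(x_I)]\, x_I^j = \sum_{i=0}^{n} \hat b_{f-g,i} \sum_I x_I^{i+j}, \tag{A6}$$

whence

$$\hat b_{f-g,i} = \left[\sum_I x_I^{i+j}\right]^{-1} \sum_I [f_I - g(x_I)]\, x_I^j, \tag{A7}$$

where the square brackets now indicate an $(n+1) \times (n+1)$ matrix, and there is an implicit sum on the $j$ index as
per the Einstein summation convention. But we can re-write this as

$$\hat b_{f-g,i} = \hat b_{f,i} - \left[\sum_I x_I^{i+j}\right]^{-1} \sum_I g(x_I)\, x_I^j, \tag{A8}$$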



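relating the least-squares estimates of $\hat b_{f,i}$ and $\hat b_{f-g,i}$. Note that by construction $i \leq n$. If we now use this to estimate
$f^{(i)}(0)$, we see:

$$\hat f^{(i)}_{[f-g]+g}(0) = \hat f^{(i)}_{f-g}(0) + g^{(i)}(0) = i!\,\hat b_{f-g,i} + g^{(i)}(0), \tag{A9}$$

whence

$$\hat f^{(i)}_{[f-g]+g}(0) = \hat f^{(i)}_{f}(0) - i! \left[\sum_I x_I^{i+j}\right]^{-1} \sum_I g(x_I)\, x_I^j + g^{(i)}(0), \tag{A10}$$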



where $\hat f^{(i)}_{f}(0)$ is the "naive" estimate of $f^{(i)}(0)$ obtained by simply fitting a polynomial to $f$ itself, and $\hat f^{(i)}_{[f-g]+g}(0)$ is
the "improved" estimate obtained by first subtracting $g(x)$, fitting $f(x) - g(x)$ to a polynomial, and then adding $g(x)$
back again. Note the formula for the shift of the estimate of the $i$th derivative of $f(x)$ is linear in the function $g(x)$
and its derivatives. In general this is the most precise statement we can make: the process of finding a truncated
Taylor series simply does not commute with the process of performing a least squares fit.

We can gain some additional insight if we use Taylor's theorem to write

$$g(x) = \sum_{k=0}^{\infty} \frac{g^{(k)}(0)}{k!}\, x^k = \sum_{k=0}^{n} \frac{g^{(k)}(0)}{k!}\, x^k + \sum_{k=n+1}^{\infty} \frac{g^{(k)}(0)}{k!}\, x^k, \tag{A11}$$

where we temporarily suspend concerns regarding convergence of the Taylor series. Then



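$$\hat f^{(i)}_{[f-g]+g}(0) = \hat f^{(i)}_{f}(0) + g^{(i)}(0) - i! \left[\sum_I x_I^{i+j}\right]^{-1} \sum_I x_I^j \left\{ \sum_{k=0}^{n} + \sum_{k=n+1}^{\infty} \right\} \frac{g^{(k)}(0)}{k!}\, x_I^k. \tag{A12}$$

So

$$\hat f^{(i)}_{[f-g]+g}(0) = \hat f^{(i)}_{f}(0) + g^{(i)}(0) - i! \left[\sum_I x_I^{i+j}\right]^{-1} \left\{ \sum_{k=0}^{n} + \sum_{k=n+1}^{\infty} \right\} \frac{g^{(k)}(0)}{k!} \sum_I x_I^{j+k}, \tag{A13}$$

whence

$$\hat f^{(i)}_{[f-g]+g}(0) = \hat f^{(i)}_{f}(0) + g^{(i)}(0) - i! \sum_{k=0}^{n} \frac{g^{(k)}(0)}{k!} \left[\sum_I x_I^{i+j}\right]^{-1} \left[\sum_I x_I^{j+k}\right] - i! \sum_{k=n+1}^{\infty} \frac{g^{(k)}(0)}{k!} \left[\sum_I x_I^{i+j}\right]^{-1} \left[\sum_I x_I^{j+k}\right]. \tag{A14}$$

But two of these matrices are simply inverses of each other, so in terms of the Kronecker delta

$$\hat f^{(i)}_{[f-g]+g}(0) = \hat f^{(i)}_{f}(0) + g^{(i)}(0) - i! \sum_{k=0}^{n} \frac{g^{(k)}(0)}{k!}\, \delta_{ik} - i! \sum_{k=n+1}^{\infty} \frac{g^{(k)}(0)}{k!} \left[\sum_I x_I^{i+j}\right]^{-1} \left[\sum_I x_I^{j+k}\right], \tag{A15}$$

which now leads to significant cancellations

$$\hat f^{(i)}_{[f-g]+g}(0) = \hat f^{(i)}_{f}(0) - i! \sum_{k=n+1}^{\infty} \frac{g^{(k)}(0)}{k!} \left[\sum_I x_I^{i+j}\right]^{-1} \left[\sum_I x_I^{j+k}\right]. \tag{A16}$$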



This is the best (ignoring convergence issues) that one can do in the general case. Note the formula for the shift of 
the estimate of the ith derivative of f(x) is linear in the derivatives of the function g(x), and that it starts with the 
(n + l)th derivative. Consequently as the order n of the polynomial used to fit the data increases there are fewer 
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terms included in the sum, so the difference between various estimates of the derivatives becomes smaller as more 
terms are added to the least squares fit. 

In the particular situation we discuss in the body of the article 



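$$f(x) \to \mu(z) = \ln\left[\frac{d(z)}{z\ {\rm Mpc}}\right]; \qquad g(x) \to \frac{K}{2}\ln(1+z); \qquad K \in \mathbb{Z}, \tag{A17}$$

or a similar formula in terms of the $y$-redshift. Consequently, from equation (A10), particularized to our case

$$\hat\mu_K^{(i)}(0) = \hat\mu^{(i)}(0) + \frac{K}{2}\,[\ln(1+z)]^{(i)}(0) - \frac{K}{2}\, i! \left[\sum_I z_I^{i+j}\right]^{-1} \sum_I z_I^j \ln(1+z_I). \tag{A18}$$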

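Then the "gap" between any two adjacent estimates for $\hat\mu_K^{(i)}(0)$ corresponds to taking $\Delta K = 1$ and so

$$\Delta\hat\mu_K^{(i)}(0) = \frac{1}{2}\,[\ln(1+z)]^{(i)}(0) - \frac{1}{2}\, i! \left[\sum_I z_I^{i+j}\right]^{-1} \sum_I z_I^j \ln(1+z_I). \tag{A19}$$

But then for the particular case $i = 1$ which is of most interest to us

$$\hat\mu_K^{(1)}(0) = \hat\mu^{(1)}(0) + \frac{K}{2} - \frac{K}{2} \left[\sum_I z_I^{1+j}\right]^{-1} \sum_I z_I^j \ln(1+z_I), \tag{A20}$$

and

$$\Delta\hat\mu_K^{(1)}(0) = \frac{1}{2} - \frac{1}{2} \left[\sum_I z_I^{1+j}\right]^{-1} \sum_I z_I^j \ln(1+z_I). \tag{A21}$$

By Taylor series expanding the logarithm, and reindexing the terms, this can also be recast as

$$\hat\mu_K^{(i)}(0) = \hat\mu^{(i)}(0) + \frac{K}{2}\, i! \sum_{k=n+1}^{\infty} \frac{(-1)^k}{k} \left[\sum_I z_I^{i+j}\right]^{-1} \left[\sum_I z_I^{j+k}\right], \tag{A22}$$

whence

$$\hat\mu_K^{(1)}(0) = \hat\mu^{(1)}(0) + \frac{K}{2} \sum_{k=n+1}^{\infty} \frac{(-1)^k}{k} \left[\sum_I z_I^{1+j}\right]^{-1} \left[\sum_I z_I^{j+k}\right], \tag{A23}$$

and

$$\Delta\hat\mu_K^{(1)}(0) = \frac{1}{2} \sum_{k=n+1}^{\infty} \frac{(-1)^k}{k} \left[\sum_I z_I^{1+j}\right]^{-1} \left[\sum_I z_I^{j+k}\right]. \tag{A24}$$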



(Because of convergence issues, if we work with $z$-redshift these last three formulae make sense only for supernovae
datasets where we restrict ourselves to $z_I < 1$; working in $y$-redshift no such constraint need be imposed.) Now
relating this to the modelling ambiguity in $q_0$, we have



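$$[\Delta q_0]_{\rm modelling} = -2\,\Delta\hat\mu_K^{(1)}(0), \tag{A25}$$

so that

$$[\Delta q_0]_{\rm modelling} = -1 + \left[\sum_I z_I^{1+j}\right]^{-1} \sum_I z_I^j \ln(1+z_I). \tag{A26}$$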



By Taylor-series expanding the logarithm, modulo convergence issues discussed above, this can also be expressed as: 

$$[\Delta q_0]_{\rm modelling} = -\sum_{k=n+1}^{\infty} \frac{(-1)^k}{k} \left[\sum_I z_I^{1+j}\right]^{-1} \left[\sum_I z_I^{j+k}\right]. \tag{A27}$$

In particular, without further calculation, these results collectively tell us that the different estimates for $q_0$ will always
be evenly spaced, and it suggests that as $n \to \infty$ the differences will become smaller. This is actually what is seen
in the data analysis we performed. If we were to have a good physics reason for choosing one particular definition of 
distance as being primary, we would use that for the least squares fit, and the other ways of estimating the derivatives 
would be "biased" — but in the current situation we have no physically preferred "best" choice of distance variable. 
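
The non-commutation described in this appendix is easy to see numerically. The sketch below is an illustration under stated assumptions, not the analysis performed in the body of the article: it uses synthetic, noise-free data on an evenly spaced redshift grid (both invented here), the choice $g(z) = (K/2)\ln(1+z)$ with $K \in \{-2,\dots,2\}$, and NumPy's `polyfit` as the least-squares routine.

```python
import numpy as np


def slope_estimate(z, f, K, n):
    """Estimate f'(0): subtract g = (K/2) ln(1+z), fit a degree-n polynomial,
    then add g'(0) = K/2 back to the fitted linear coefficient."""
    g = 0.5 * K * np.log1p(z)
    coeffs = np.polyfit(z, f - g, n)  # coefficients, highest power first
    b1 = coeffs[-2]                   # linear coefficient of the fit
    return b1 + 0.5 * K


# Synthetic, noise-free "data" (an assumption purely for illustration).
z = np.linspace(0.02, 1.0, 50)
f = 0.8 * z - 0.3 * z**2 + 0.1 * z**3 - 0.05 * z**4

Ks = [-2, -1, 0, 1, 2]
gaps = {}
evenly_spaced = {}
for n in (1, 2, 3):
    slopes = np.array([slope_estimate(z, f, K, n) for K in Ks])
    steps = np.diff(slopes)
    # The shift is linear in K, so adjacent estimates differ by a constant gap.
    evenly_spaced[n] = bool(np.allclose(steps, steps[0]))
    gaps[n] = abs(steps[0])
    print(f"degree {n}: slopes = {np.round(slopes, 4)}, gap = {gaps[n]:.4f}")
```

For each polynomial degree the five slope estimates come out equally spaced, and on this grid the gap shrinks as the degree is raised from 1 to 3, in line with the qualitative statements above.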



APPENDIX B: COMBINING MEASUREMENTS FROM DIFFERENT MODELS 

Suppose one has a collection of measurements $X_a$, each of which is represented by a random variable $X_a$ with mean
$\mu_a = E(X_a)$ and variance $\sigma_a^2 = E([X_a - \mu_a]^2)$. How should one then combine these measurements into an overall
"best estimate"? 

If we have no good physics reason to reject one of the measurements then the best we can do is to describe the 
combined measurement process by a random variable $X_A$ where $A$ is now a discrete random variable that picks one
of the measurement techniques with some probability $p_a$. More precisely

$$\mathrm{Prob}(A = a) = p_a, \tag{B1}$$

where the values $p_a$ are for now left arbitrary. Then

$$\mu = E(X_A) = \sum_a p_a\, E(X_a) = \sum_a p_a\, \mu_a, \tag{B2}$$

and

$$E(X_A^2) = \sum_a p_a\, E(X_a^2) = \sum_a p_a\, (\sigma_a^2 + \mu_a^2). \tag{B3}$$

But equally well

$$E(X_A^2) = \sigma^2 + \mu^2, \tag{B4}$$

so that overall

$$\sigma^2 = \sum_a p_a\, (\sigma_a^2 + \mu_a^2) - \mu^2, \tag{B5}$$

and

$$\sigma^2 = \sum_a p_a\, \sigma_a^2 + \sum_a p_a\, (\mu_a - \mu)^2. \tag{B6}$$

This lets us split the overall variance into the contribution from the purely statistical uncertainties on the individual
measurements

$$\sigma_{\rm statistical} = \sqrt{\sum_a p_a\, \sigma_a^2}, \tag{B7}$$

plus the "modelling ambiguity" arising from different ways of modelling the same physics

$$\sigma_{\rm modelling} = \sqrt{\sum_a p_a\, (\mu_a - \mu)^2}. \tag{B8}$$

In the particular case we are interested in we have 5 different ways of modelling distance and no particular reason for
choosing one definition of measurement over all the others, so it is best to take $p_a = 1/5$.

Furthermore in the case of the estimates for the deceleration parameter, all individual estimates have the same 
statistical uncertainty, and the estimates are equally spaced with a gap A: 

$$\sigma_a = \sigma_0; \qquad \mu_n = \mu_0 + n\Delta; \qquad n \in \{-2,-1,0,1,2\}. \tag{B9}$$

Therefore

$$\mu = \mu_0; \qquad \sigma_{\rm statistical} = \sigma_0; \qquad \sigma_{\rm modelling} = \sqrt{2}\,\Delta. \tag{B10}$$

For estimates of the jerk, we no longer have the simple equal-spacing rule and equal statistical uncertainties rule, but
there is still no good reason for preferring one distance surrogate over all the others so we still take $p_a = 1/5$ and the
estimate obtained from the combined measurements satisfies

$$\mu = \frac{\sum_a \mu_a}{5}; \qquad \sigma_{\rm statistical} = \sqrt{\frac{\sum_a \sigma_a^2}{5}}; \qquad \sigma_{\rm modelling} = \sqrt{\frac{\sum_a (\mu_a - \mu)^2}{5}}. \tag{B11}$$

These formulae are used to calculate the statistical and modelling uncertainties reported in tables VI–VII and VIII–IX.
Note that by definition the combined purely statistical and modelling uncertainties are to be added in quadrature

$$\sqrt{\sigma_{\rm statistical}^2 + \sigma_{\rm modelling}^2}.$$


This discussion does not yet deal with the estimated systematic uncertainties ("known unknowns") or "historically 
estimated" systematic uncertainties ("unknown unknowns"). 
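
The mixture construction of this appendix can be written as a few lines of code. The sketch below is a minimal illustration: the function name `combine_models` and the numerical inputs (a central value, a gap, and a common statistical error) are hypothetical, chosen only to exercise the special case of five equally weighted, equally spaced estimates.

```python
import math


def combine_models(means, sigmas, probs=None):
    """Combine estimates from several models treated as a discrete mixture.

    Returns (mean, sigma_statistical, sigma_modelling), where the two
    uncertainties add in quadrature to give the total mixture sigma.
    """
    n = len(means)
    if probs is None:
        probs = [1.0 / n] * n  # no model preferred over the others
    mu = sum(p * m for p, m in zip(probs, means))
    var_stat = sum(p * s * s for p, s in zip(probs, sigmas))
    var_model = sum(p * (m - mu) ** 2 for p, m in zip(probs, means))
    return mu, math.sqrt(var_stat), math.sqrt(var_model)


# Hypothetical inputs: five equally spaced estimates with a common error.
mu0, gap, sigma0 = -0.6, 0.07, 0.17
means = [mu0 + k * gap for k in (-2, -1, 0, 1, 2)]

mu, s_stat, s_model = combine_models(means, [sigma0] * 5)
print(f"mean = {mu:.3f}, statistical = {s_stat:.3f}, modelling = {s_model:.3f}")
print(f"sqrt(2) * gap = {math.sqrt(2) * gap:.3f}")
```

For equal weights and equal spacing the modelling term reduces to $\sqrt{2}$ times the gap, while the mean and statistical term are unchanged from the central estimate.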
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